Human Voice

Published by: Sasanka Sekhar Pani on Oct 22, 2010
Copyright: Attribution Non-commercial








  • Voice modulation in spoken language
  • Vocal resonation
  • Voice disorders
  • History
  • Development
  • Non-native accents
  • Social factors
  • Prestige
  • Accent Discrimination
  • Acting and accents
  • Legal implications
  • Technique
  • Physiology
  • Possible dangers of belting
  • Histoanatomy of the Glottis
  • Layered Structure of the Adult Vocal Fold
  • Epithelium
  • Basal Lamina or Basement Membrane Zone (BMZ)
  • Superficial Layer of the Lamina Propria
  • The Transition: Intermediate and Deep Layers of the Lamina Propria
  • The Body: The Thyroarytenoid Muscle
  • Vocal Fold Lesions
  • Reinke's Edema
  • Histological Changes From Birth to Old Age
  • Pediatrics
  • Adulthood
  • Old Age
  • Noise levels
  • Quantity to be measured / Unit of measurement / Good values
  • Intelligibility with different types of speech
  • Lombard speech
  • Screaming
  • Clear speech
  • Infant-directed speech
  • Citation speech
  • Hyperspace speech
  • Private loop
  • Public loop
  • Neurology
  • Choral singing
  • Animal vocalization
  • Stricture
  • Other parameters
  • Individual manners
  • Perspectival aspects
  • Organic aspects
  • Expressive aspects
  • Linguistic aspects
  • Verbal vs. oral communication
  • Proxemics: physical space in communication
  • Chronemics: time in communication
  • Posture
  • Gesture
  • Haptics: touching in communication
  • Concealing deception
  • The relative importance of verbal and nonverbal communication
  • Interaction of verbal and nonverbal communication
  • Phonation
  • Non-phonemic phonation
  • Myoelastic and aerodynamic theory
  • Neurochronaxic theory
  • Mazatec
  • breathy voice [ja ] he wears
  • modal voice [já] tree
  • Open glottis [t] voiceless (full airstream)
  • Glottal consonants
  • Voice modal breathy harsh faucalized
  • Pubic hair
  • Voice change
  • Male musculature and body shape
  • Body odor and acne
  • Menstruation and fertility
  • Genetic influence and environmental factors
  • Variations of sequence
  • Conclusion
  • Endocrine perspective
  • Hormonal changes in boys
  • Hormonal changes in girls
  • Electronic devices
  • Concatenative synthesis
  • Formant synthesis
  • Articulatory synthesis
  • HMM-based synthesis
  • Sinewave synthesis
  • Text normalization challenges
  • Text-to-phoneme challenges
  • Evaluation challenges
  • Atari
  • Apple
  • Microsoft Windows
  • Android
  • Internet
  • Others
  • Breathing
  • Range and Tone
  • Pedagogical philosophy
  • The nature of vocal sounds
  • Respiration
  • Resonation
  • Breathing and breath support
  • Voice classification
  • Female voices
  • Male voices
  • Vocal registration
  • Coordination
  • General music studies
  • Performance skills and practices
  • Soprano
  • Contralto
  • Countertenor
  • Tenor
  • Baritone
  • Bass
  • The voice from childhood to adulthood



  • Accent (Linguistics)
  • Acoustic Phonetics
  • Belt (Music)
  • Histology Of Vocal Folds
  • Intelligibility (Communication)
  • Lombard Effect
  • Manner Of Articulation
  • Paralanguage: Nonverbal Voice Cues In Communication
  • Phonation
  • Phonetics
  • Voice Change In Boys
  • Speaker Recognition
  • Speech Synthesis
  • Vocal Loading
  • Vocal Rest
  • Vocal Range
  • Vocal Warm Up
  • Vocology
  • Voice Analysis
  • Voice Disorders
  • Voice Frequency
  • Voice Organ
  • Voice Pedagogy
  • Voice Projection
  • Voice Synthesis
  • Voice Types (Singing Voices)
  • Use Of The Web By People With Disabilities

Human Voice
The human voice consists of sound made by a human being using the vocal folds for talking, singing, laughing, crying, screaming, etc. The human voice is specifically that part of human sound production in which the vocal folds (vocal cords) are the primary sound source. Generally speaking, the mechanism for generating the human voice can be subdivided into three parts: the lungs, the vocal folds within the larynx, and the articulators. The lungs (the pump) must produce adequate airflow and air pressure to vibrate the vocal folds (this air pressure is the fuel of the voice). The vocal folds (vocal cords) are a vibrating valve that chops up the airflow from the lungs into audible pulses that form the laryngeal sound source. The muscles of the larynx adjust the length and tension of the vocal folds to 'fine-tune' pitch and tone. The articulators (the parts of the vocal tract above the larynx, consisting of tongue, palate, cheek, lips, etc.) articulate and filter the sound emanating from the larynx and to some degree can interact with the laryngeal airflow to strengthen or weaken it as a sound source. The vocal folds, in combination with the articulators, are capable of producing highly intricate arrays of sound. The tone of voice may be modulated to suggest emotions such as anger, surprise, or happiness. Singers use the human voice as an instrument for creating music.

Voice types and the folds (cords) themselves
Adult men and women have different vocal fold sizes, reflecting the male-female differences in larynx size. Adult male voices are usually lower-pitched and have larger folds. The male vocal folds (which would be measured vertically in the accompanying diagram) are between 17 mm and 25 mm in length, while the female vocal folds are between 12.5 mm and 17.5 mm in length.
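The size difference above can be given a rough quantitative reading. In a crude string-model approximation (an illustration only, not a claim from the text), fundamental frequency scales inversely with vocal fold length at fixed tension and mass per unit length; the 125 Hz male reference pitch below is an assumed typical speaking value.

```python
# Crude string-model sketch: f0 is proportional to 1/L at fixed tension
# and mass per unit length. Lengths are midpoints of the ranges quoted
# above; the 125 Hz male reference pitch is an assumed typical value.

def scaled_f0(reference_f0_hz, reference_length_mm, target_length_mm):
    """Scale a reference fundamental frequency by the inverse ratio of
    vocal fold lengths."""
    return reference_f0_hz * reference_length_mm / target_length_mm

male_length = (17 + 25) / 2        # 21 mm, midpoint of 17-25 mm
female_length = (12.5 + 17.5) / 2  # 15 mm, midpoint of 12.5-17.5 mm

print(round(scaled_f0(125.0, male_length, female_length)))  # 175
```

The estimate of about 175 Hz lands in the right region for adult female speech, suggesting that fold length alone accounts for much, though not all, of the pitch difference; the larger male vocal tract discussed below shapes timbre rather than fundamental frequency.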

A labeled anatomical diagram of the vocal folds or cords.


As seen in the illustration, the folds are located just above the vertebrate trachea (the windpipe, which travels from the lungs). Food and drink do not pass through the cords but instead pass through the esophagus, an unlinked tube. Both tubes are separated by the epiglottis, a "flap" that covers the opening of the trachea while swallowing.

The folds in both sexes are within the larynx. They are attached at the back (the side nearest the spinal cord) to the arytenoid cartilages, and at the front (the side under the chin) to the thyroid cartilage. They have no outer edge, as they blend into the side of the breathing tube (the illustration is out of date and does not show this well), while their inner edges or "margins" are free to vibrate (the hole). They have a three-layer construction of an epithelium, vocal ligament, then muscle (the vocalis muscle), which can shorten and bulge the folds. They are flat triangular bands and are pearly white in color. Above both sides of the vocal cord is the vestibular fold or false vocal cord, which has a small sac between its two folds (not illustrated).

The difference in vocal fold size between men and women means that they have differently pitched voices. Additionally, genetics also causes variances among the same sex, with men's and women's singing voices being categorized into types. For example, among men there are bass, baritone, tenor and countertenor (ranging from E2 to even F6), and among women, contralto, mezzo-soprano and soprano (ranging from F3 to C6); there are additional categories for operatic voices (see voice type). This is not the only source of difference between male and female voice. Men, generally speaking, have a larger vocal tract, which essentially gives the resultant voice a lower-sounding timbre. This is mostly independent of the vocal folds themselves.

Voice modulation in spoken language
Human spoken language makes use of the ability of almost all persons in a given society to dynamically modulate certain parameters of the laryngeal voice source in a consistent manner. The most important communicative, or phonetic, parameters are the voice pitch (determined by the vibratory frequency of the vocal folds) and the degree of separation of the vocal folds, referred to as vocal fold adduction (coming together) or abduction (separating).
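The singing ranges quoted above (E2 to F6 for male voice types, F3 to C6 for female) can be expressed as vocal fold vibration frequencies with the standard equal-temperament formula f = 440 * 2**((m - 69) / 12), where m is the MIDI note number; the A4 = 440 Hz reference is the usual convention, not a figure from the text.

```python
# Convert the note names quoted above into frequencies in hertz,
# assuming equal temperament with A4 = 440 Hz (MIDI note 69).

NOTE_OFFSETS = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}

def note_to_hz(name):
    """Convert a note name like 'E2' to its equal-temperament frequency."""
    letter, octave = name[0], int(name[1:])
    midi = 12 * (octave + 1) + NOTE_OFFSETS[letter]
    return 440.0 * 2 ** ((midi - 69) / 12)

for note in ('E2', 'F6', 'F3', 'C6'):
    print(note, round(note_to_hz(note), 1))
# E2 82.4, F6 1396.9, F3 174.6, C6 1046.5
```

So the combined male range discussed here spans roughly 82 Hz to 1.4 kHz of fold vibration, and the female range roughly 175 Hz to 1.05 kHz.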
The ability to vary the ab/adduction of the vocal folds quickly has a strong genetic component, since vocal fold adduction has a life-preserving function in keeping food from passing into the lungs, in addition to the covering action of the epiglottis. Consequently, the muscles that control this action are among the fastest in the body. Children can learn to use this action consistently during speech at an early age, as they learn to speak the difference between utterances such as "apa" (having an abductory-adductory gesture for the p) and "aba" (having no abductory-adductory gesture). Surprisingly enough, they can learn to do this well before the age of two by listening only to the voices of adults around them who have voices much different from their own, and even though the laryngeal movements causing these phonetic differentiations are deep in the throat and not visible to them. If an abductory or adductory movement is strong enough, the vibrations of the vocal folds will stop (or not start). If the gesture is abductory and is part of a speech sound, the sound will be called voiceless. However, voiceless speech sounds are sometimes better identified as containing an abductory gesture, even if the gesture was not strong enough to stop the vocal folds from vibrating. This anomalous feature of voiceless speech sounds is better understood if it is realized that it is the change in the spectral qualities of the voice as abduction proceeds that is the primary acoustic attribute the listener attends to when identifying a voiceless speech sound, and not simply the presence or absence of voice (periodic energy). An adductory gesture is also identified by the change in voice spectral energy it produces. Thus, a speech sound having an adductory gesture may be referred to as a "glottal stop" even if the vocal fold vibrations do not entirely stop. An example illustrating this can be obtained by using inverse filtering of oral airflow.
Other aspects of the voice, such as variations in the regularity of vibration, are also used for communication and are important for the trained voice user to master, but they are more rarely used in the formal phonetic code of a spoken language.

Physiology and vocal timbre

The sound of each individual's voice is entirely unique, not only because of the actual shape and size of an individual's vocal cords but also due to the size and shape of the rest of that person's body, especially the vocal tract, and the manner in which the speech sounds are habitually formed and articulated. (It is this latter aspect of the sound of the voice that can be mimicked by skilled performers.) Humans have vocal folds that can loosen, tighten, or change their thickness, and over which breath can be transferred at varying pressures. The shape of the chest and neck, the position of the tongue, and the tightness of otherwise unrelated muscles can be altered. Any one of these actions results in a change in pitch, volume, timbre, or tone of the sound produced. Sound also resonates within different parts of the body, and an individual's size and bone structure can affect somewhat the sound produced by an individual.

Singers can also learn to project sound in certain ways so that it resonates better within their vocal tract. This is known as vocal resonation. The primary method for singers to accomplish this is through the use of the Singer's Formant, which has been shown to be a resonance added to the normal resonances of the vocal tract above the frequency range of most instruments, and so enables the singer's voice to carry better over musical accompaniment. Another major influence on vocal sound and production is the function of the larynx, which people can manipulate in different ways to produce different sounds. These different kinds of laryngeal function are described as different kinds of vocal registers.

Vocal registration
Vocal registration refers to the system of vocal registers within the human voice. A register in the human voice is a particular series of tones, produced in the same vibratory pattern of the vocal folds, and possessing the same quality. Registers originate in laryngeal functioning. They occur because the vocal folds are capable of producing several different vibratory patterns. Each of these vibratory patterns appears within a particular range of pitches and produces certain characteristic sounds. The term register can be somewhat confusing, as it encompasses several aspects of the human voice. It can be used to refer to any of the following:
  • A particular part of the vocal range, such as the upper, middle, or lower registers.
  • A resonance area, such as chest voice or head voice.
  • A phonatory process.
  • A certain vocal timbre.
  • A region of the voice that is defined or delimited by vocal breaks.
  • A subset of a language used for a particular purpose or in a particular social setting (in linguistics, a register language is a language that combines tone and vowel phonation into a single phonological system).

Within speech pathology the term vocal register has three constituent elements: a certain vibratory pattern of the vocal folds, a certain series of pitches, and a certain type of sound. Speech pathologists identify four vocal registers based on the physiology of laryngeal function: the vocal fry register, the modal register, the falsetto register, and the whistle register. This view is also adopted by many vocal pedagogists.

Vocal resonation
Vocal resonation is the process by which the basic product of phonation is enhanced in timbre and/or intensity by the air-filled cavities through which it passes on its way to the outside air. Various terms related to the resonation process include amplification, enrichment, enlargement, improvement, intensification, and prolongation, although in strictly scientific usage acoustic authorities would question most of them. The main point to be drawn from these terms by a singer or speaker is that the end result of resonation is, or should be, to make a better sound. There are seven areas that may be listed as possible vocal resonators. In sequence from the lowest within the body to the highest, these areas are the chest, the tracheal tree, the larynx itself, the pharynx, the oral cavity, the nasal cavity, and the sinuses.

Influences of the human voice

The range of the human voice is quite astounding. The twelve-tone musical scale, upon which some of the music in the world is based, may have its roots in the sound of the human voice during the course of evolution, according to a study published by the New Scientist. Analysis of recorded speech samples found peaks in acoustic energy that mirrored the distances between notes in the twelve-tone scale.

Voice disorders
There are many disorders that affect the human voice; these include speech impediments and growths and lesions on the vocal folds. Talking for improperly long periods of time causes vocal loading, which is stress inflicted on the speech organs. When vocal injury is done, an ENT specialist may often be able to help, but the best treatment is the prevention of injuries through good vocal production. Voice therapy is generally delivered by a speech-language pathologist. Hoarseness or breathiness that lasts for more than two weeks is a common symptom of an underlying voice disorder and should be investigated medically.

Range of the Human Voice
A man's voice ranges from bass to tenor, the medium being what is called a barytone; the female voice ranges from contralto to soprano, the medium being termed a mezzo-soprano; a boy's voice is alto, or between a tenor and a treble. There being about 9 perfect tones, but 17,592,186,044,415 different sounds: thus, 14 direct muscles, alone or together, produce 16,383; 30 indirect muscles, ditto, 173,741,823; and all in co-operation produce the number we have named, and these independently of different degrees of intensity.

Phonography
Phonography is a method of writing by signs that represent the sounds of the language. It differs from stenography in this respect: stenography uses characters to represent words by their spelling, instead of their sound; hence phonography is much the shortest and simplest mode of short-hand writing.

Accent (linguistics)
In linguistics, an accent is a manner of pronunciation of a language. An accent may be associated with the region in which its speakers reside (a geographical or regional accent), the socio-economic status of its speakers, their ethnicity, their caste or social class, their first language (when the language in which the accent is heard is not their native language), and so on. Accents can be confused with dialects, which are varieties of language differing in vocabulary, syntax, and morphology as well as pronunciation. Dialects are usually spoken by a group united by geography or social status.

History
As human beings spread out into isolated communities, stresses and peculiarities develop. Over time these can develop into identifiable accents. In North America, the interaction of people from many ethnic backgrounds contributed to the formation of the different varieties of North American accents. It is difficult to measure or predict how long it takes an accent to formulate.
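The large figures in the muscle-count passage are subset counts: n muscles acting "alone or together" can form 2**n - 1 non-empty combinations. The quoted 16,383 is exactly 2**14 - 1, and 17,592,186,044,415 is exactly 2**44 - 1 (14 direct plus 30 indirect muscles); the indirect-muscle figure printed in old sources as 173,741,823 appears to be a misprint of 1,073,741,823 = 2**30 - 1. A quick check:

```python
# The quoted sound counts are the number of non-empty subsets of
# n muscles, i.e. 2**n - 1.

def muscle_combinations(n_muscles):
    """Number of non-empty combinations of n muscles acting together."""
    return 2 ** n_muscles - 1

print(muscle_combinations(14))       # 16383: direct muscles, alone or together
print(muscle_combinations(30))       # 1073741823: indirect muscles (2**30 - 1)
print(muscle_combinations(14 + 30))  # 17592186044415: all in co-operation
```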

Accents in the USA, Canada and Australia, for example, developed from the combinations of different accents and languages in various societies, and the effect of this on the various pronunciations of the British settlers. In many cases, the accents of non-English settlers from Great Britain and Ireland affected the accents of the different colonies quite differently: Irish, Scottish and Welsh immigrants had accents which greatly affected the vowel pronunciation of certain areas of Australia and Canada. North American accents remain more distant, either as a result of time or of external or "foreign" linguistic interaction, such as the Italian accent.

Development
Children are able to take on accents relatively quickly. Children of immigrant families, for example, generally have a more native-like pronunciation than their parents, though both children and parents may have a noticeable non-native accent. Accents seem to remain relatively malleable until a person's early twenties, after which a person's accent seems to become more entrenched. All the same, accents are not fixed even in adulthood. An acoustic analysis by Jonathan Harrington of Queen Elizabeth II's Royal Christmas Messages revealed that the speech patterns of even so conservative a figure as a monarch can continue to change over her lifetime.

Non-native accents
Pronunciation is the most difficult part of a non-native language to learn. Most individuals who speak a non-native language fluently speak it with an accent of their native tongue. The most important factor in predicting the degree to which the accent will be noticeable (or strong) is the age at which the non-native language was learned. The critical period theory states that if learning takes place after the critical period (usually considered around puberty) for acquiring native-like pronunciation, an individual is unlikely to acquire a native-like accent. This theory, however, is quite controversial among researchers. Although many subscribe to some form of the critical period, they either place it earlier than puberty or consider it more of a critical "window," which may vary from one individual to another and depend on factors other than age, such as length of residence, similarity of the non-native language to the native language, and the frequency with which both languages are used. Nevertheless, children as young as 6 at the time of moving to another country often speak with a noticeable non-native accent as adults. There are also rare instances of individuals who are able to pass for native speakers even if they learned their non-native language in early adulthood. However, neurological constraints associated with brain development appear to limit most non-native speakers' ability to sound native-like. Most researchers agree that for adults, acquiring a native-like accent in a non-native language is near impossible.

Social factors
When a group defines a standard pronunciation, speakers who deviate from it are often said to "speak with an accent". However, everyone speaks with an accent. People from the United States would "speak with an accent" from the point of view of an Australian, and vice versa. Accents such as BBC English or General American or Standard American may sometimes be erroneously designated in their countries of origin as "accentless" to indicate that they offer no obvious clue to the speaker's regional or social background.

Prestige
Certain accents are perceived to carry more prestige in a society than other accents. This is often due to their association with the elite part of society; for example, in the United Kingdom, Received Pronunciation of the English language is associated with the traditional upper class. However, in linguistics there is no differentiation among accents in regards to their prestige, aesthetics, or correctness. All languages and accents are linguistically equal.

Accent Stereotyping and Prejudice

Stereotypes refer to specific characteristics, traits, and roles that a group and its members are believed to possess. Stereotypes can be both positive and negative, although negative are more common. Individuals with non-standard accents often have to deal with both negative stereotypes and prejudice because of an accent. Researchers consistently show that people with accents are judged as less intelligent, less competent, less educated, as having poor English/language skills, and as unpleasant to listen to.[19][20] Not only do people with standard accents subscribe to these beliefs and attitudes, but individuals with accents also often stereotype against their own or others' accents. Stereotypes may result in prejudice, which is defined as having negative attitudes toward a group and its members.[21] Studies have shown that it is the perception of the accent, not the accent by itself, that often results in negative evaluations of speakers. In a study conducted by Rubin (1992), students listened to a taped lecture recorded by the same native English speaker with a standard accent. However, they were shown a picture of the lecturer, who was either Caucasian or Asian. Participants in the study who saw the Asian picture believed that they had heard an accented lecturer and performed worse on a task measuring lecture comprehension.[22][23] Negative evaluations may reflect the prejudices rather than real issues with understanding accents: on average, students taught by non-native English speakers do not underperform when compared to those taught by native speakers of English.

Accent Discrimination
Discrimination refers to specific behaviors or actions directed at a group or its individual members based solely on the group membership. In accent discrimination, one's way of speaking is used as a basis for arbitrary evaluations and judgments. Unlike other forms of discrimination, there are no strong norms against accent discrimination in the general society. Rosina Lippi-Green writes, "Accent serves as the first point of gate keeping because we are forbidden, by law and social custom, and perhaps by a prevailing sense of what is morally and ethically right, from using race, ethnicity, homeland or economics more directly. We have no such compunctions about language. Thus, accent becomes a litmus test for exclusion, an excuse to turn away, to recognize the other." Speakers with accents often experience discrimination in housing and employment. For example, landlords are less likely to call back speakers who have foreign or ethnic accents, and speakers with accents are more likely to be assigned by employers to lower status positions than are those with standard accents. In business settings, individuals with non-standard accents are more likely to be evaluated negatively. Accent discrimination is also present in educational institutions: for example, non-native speaking graduate students, lecturers, and professors across college campuses in the US have been targeted for being unintelligible because of accent. The perception of, or sensitivity of others to, accents means that generalizations are passed off as acceptable.

Acting and accents
Actors are often called upon to speak varieties of language other than their own. For example, an actor may portray a character of some nationality other than his or her own by adopting into the native language the phonological profile typical of the nationality to be portrayed - what is commonly called "speaking with an accent". Gary Oldman has become known for playing eccentrics and for his mastery of accents; one example would be Viggo Mortensen's use of a Russian accent in his portrayal of Nikolai in the movie Eastern Promises. Attempted accents are not always well received, however: Brad Pitt's Jamaican accent in Meet Joe Black, Angelina Jolie's Greek accent in the film Alexander (said by critics to be distracting), and Missouri-born actor Dick van Dyke's attempt to imitate a cockney accent in the film Mary Poppins are often-cited examples.

Accents may have associations and implications for an audience. For example, in Disney films from the 1990s onward, English accents are generally employed to serve one of two purposes: slapstick comedy or evil genius. Examples include Aladdin (the Sultan and Jafar, respectively), The Lion King (Zazu and Scar, respectively), The Hunchback of Notre Dame (Victor the Gargoyle and Frollo, respectively), and Pocahontas (Wiggins and Ratcliffe, respectively - both of whom happen to be played by the same actor, American David Ogden Stiers).

Legal implications
In the United States, Title VII of the Civil Rights Act of 1964 prohibits discrimination based on national origin, implying accents. However, employers can insist that a person's accent impairs his or her communication skills that are necessary to the effective business operation, and be off the hook. The courts often rely on the employer's claims or use judges' subjective opinions when deciding whether the (potential) employee's accent would interfere with communication or performance, without any objective proof that accent was or might be a hindrance. Kentucky's highest court, in the case of Clifford vs. Commonwealth, held that a white police officer who had not seen the black defendant allegedly involved in a drug transaction could nevertheless identify him as a participant by saying that a voice on an audiotape "sounded black." The police officer based this "identification" on the fact that the defendant was the only African American man in the room at the time of the transaction and that an audiotape contained the voice of a man the officer said "sounded black" selling crack cocaine to a white informant planted by the police.

Acoustic phonetics
Acoustic phonetics is a subfield of phonetics which deals with the acoustic aspects of speech sounds. Acoustic phonetics investigates properties like the mean squared amplitude of a waveform, its duration, its fundamental frequency, or other properties of its frequency spectrum, and the relationship of these properties to other branches of phonetics (e.g. articulatory or auditory phonetics), and to abstract linguistic concepts like phones, phrases, or utterances. The study of acoustic phonetics was greatly enhanced in the late 19th century by the invention of the Edison phonograph. The phonograph allowed the speech signal to be recorded and then later processed and analyzed. By replaying the same speech signal from the phonograph several times, filtering it each time with a different band-pass filter, a spectrogram of the speech utterance could be built up. A series of papers by Ludimar Hermann published in Pflüger's Archiv in the last two decades of the 19th century investigated the spectral properties of vowels and consonants using the Edison phonograph, and it was in these papers that the term formant was first introduced. Hermann also played back vowel recordings made with the Edison phonograph at different speeds to distinguish between Willis' and Wheatstone's theories of vowel production. Further advances in acoustic phonetics were made possible by the development of the telephone industry. (Incidentally, Alexander Graham Bell's father, Alexander Melville Bell, was a phonetician.) During World War II, work at the Bell Telephone Laboratories (which invented the spectrograph) greatly facilitated the systematic study of the spectral properties of periodic and aperiodic speech sounds, vocal tract resonances and vowel formants, voice quality, prosody, etc. On a theoretical level, acoustic phonetics really took off when it became clear that speech acoustics could be modeled in a way analogous to electrical circuits.
Lord Rayleigh was among the first to recognize that the new electric theory could be used in acoustics, but it was not until 1941 that the circuit model was effectively used, in a book by Chiba and Kajiyama called "The Vowel: Its Nature and Structure". (Interestingly, this book by Japanese authors working in Japan was published in English at the height of World War II.)
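The band-pass procedure Hermann used with the phonograph, replaying the same signal once per filter and noting the energy in each band over time, can be sketched in a few lines. This is an illustrative reconstruction, not code from any source: each "filter" here is a one-bin discrete Fourier transform (correlation with a complex sinusoid at the band's centre frequency), which is enough to show how a spectrogram is assembled band by band.

```python
import cmath
import math

def filterbank_spectrogram(signal, sample_rate, band_centers_hz,
                           frame_len=256, hop=128):
    """Build a spectrogram row by row: one pass over the signal per
    band-pass 'filter', recording the band's energy in each frame."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spectrogram = []
    for f in band_centers_hz:
        # A one-bin DFT acts as a narrow band-pass filter centred on f.
        probe = [cmath.exp(-2j * math.pi * f * n / sample_rate)
                 for n in range(frame_len)]
        row = [abs(sum(s * p for s, p in zip(frame, probe)))
               for frame in frames]
        spectrogram.append(row)
    return spectrogram  # shape: (n_bands, n_frames)

# Demo: a pure 440 Hz tone, analysed with three candidate bands.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
spec = filterbank_spectrogram(tone, sr, [220, 440, 880])
loudest_band = [max(range(len(spec)), key=lambda b: spec[b][t])
                for t in range(len(spec[0]))]
print(set(loudest_band))  # {1}: the 440 Hz band dominates every frame
```

Running it on the pure tone shows the 440 Hz band dominating every frame, which is exactly the row-by-row picture Hermann's repeated-filtering method produced.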

In 1952, Roman Jakobson, Gunnar Fant, and Morris Halle wrote "Preliminaries to Speech Analysis", a seminal work tying acoustic phonetics and phonological theory together. This little book was followed in 1960 by Fant's "Acoustic Theory of Speech Production", which has remained the major theoretical foundation for speech acoustics research in both academia and industry. (Fant was himself very involved in the telephone industry.) Other important framers of the field include Kenneth N. Stevens, Osamu Fujimura, and Peter Ladefoged.

Belt (music)
Belting (or vocal belting) refers to a specific technique of singing by which a singer produces a loud sound in the upper middle of the pitch range. It is often described as a vocal register, although some dispute this since technically the larynx is not oscillating in a unique way. Singers can use belting to convey heightened emotional states.

Technique
The term "belt" is sometimes mistakenly described as the use of chest voice in the higher part of the voice. (The chest voice is a very general term for the sound and muscular functions of the speaking voice, singing in the lower range, and the voice used to shout. Still, all those possibilities require help from the muscles in the vocal folds and a thicker closure of the vocal folds. The term "chest voice" is therefore often a misunderstanding, as it describes muscular work in the chest area of the body, but the "sound" described as "chest voice" is also produced by work of the vocal folds.) However, the proper production of the belt voice according to some vocal methods involves minimizing tension in the throat and changing the typical placement of the voice sound in the mouth, bringing it forward into the hard palate. It is possible to learn classical vocal methods like bel canto and also be able to belt; in fact, many musical roles now require it. The belt sound is easier for some than others, but the sound is possible for classical singers, too.
It requires muscle coordinations not readily used by classically trained singers, which may be why some opera singers find learning to belt challenging. In order to increase the number of high notes one can belt, one must practice. This can be done by repeatedly attempting to hit the note in a melody line, or by using vocalise programs utilizing scales. Many commercial learn-to-sing packages have a set of scales to sing along to as their main offering, which the purchaser must practice with often to see improvement. 'Belters' are not exempt from developing a strong head voice: the more resonant their higher register in head voice, the better the belted notes in this range will be. Some belters find that after a period of time focusing on the belt, the head voice will have improved and, likewise, after a period of time focusing on the head voice, the belt may be found to have improved.

Physiology
There are many explanations as to how the belting voice quality is produced. When approaching the matter from the bel canto point of view, it is said that the chest voice is applied to the higher register. However, through studying singers who use a "mixed" sound, practitioners have defined mixed sound as belting. One researcher, Jo Estill, has conducted research on the belting voice and describes belting as an extremely muscular and physical way of singing. When observing the vocal tract and torso of singers while belting, Estill observed:
  • Minimal airflow (a longer closed phase, 70% or greater, than in any other type of phonation).
  • Maximum muscular engagement of the torso (in Estill terms: torso anchoring).
  • Engagement of muscles in the head and neck in order to stabilize the larynx (in Estill terms: head and neck anchoring).
  • A downwards tilt of the cricoid cartilage (an alternative option would be the thyroid tilting backwards; observations show a larger CT space).
  • High positioning of the larynx

- Maximum muscular effort of the extrinsic laryngeal muscles, with minimum effort at the level of the true vocal folds
- Narrowing of the aryepiglottic sphincter (the "twanger")

Possible dangers of belting

Use of belting without proper coordination can lead to forcing, and forcing can lead consequently to vocal deterioration. Moderate use of the technique and, most importantly, retraction of the ventricular folds while singing are vital to safe belting. Without proper training in retraction, belting can indeed cause trauma to the vocal folds that requires the immediate attention of a doctor. Most tutors and some students of the method known as Speech Level Singing, created and supported by Seth Riggs, regard belting as damaging to long-term vocal health. They may teach an alternative using a "mixed" or middle voice which can sound almost as strong, as demonstrated by Aretha Franklin, Patti LaBelle, Celine Dion, Whitney Houston, Mariah Carey, Lara Fabian, Ziana Zain, and Regine Velasquez. The subject of belting is a matter of heated controversy among singers, singing teachers, and methodologies. Proponents of belting say that it is a "soft yell" and that, if produced properly, it can be healthy: it does not require straining, and they say it is not damaging to the voice. The larynx, though, is higher than in classical technique, and many experts on the singing voice believe that a high larynx position is both dangerous to vocal health and productive of what many find to be an unpleasant sound. According to master teacher David Jones, "Some of the dangers are general swelling of the vocal cords, pre-polyp swelling, ballooning of capillaries on the surface of the vocal cords, or vocal nodules. A high-larynxed approach to the high voice taught by a speech level singing instructor who does not listen appropriately can lead to one or ALL of these vocal disorders". It is also thought by some that belting will produce vocal nodules; this may be true if belting is produced incorrectly.
If the sound produced is a mixed head and chest sound that safely approximates a belt, and it is produced well, there may be no damage to the vocal folds. As for the physiological and acoustical features of metallic voices, a master's thesis has drawn the following conclusions:

- No significant changes in the frequency and amplitude of F1 were observed
- Significant increases in the amplitudes of F2, F3, and F4 were found
- In the frequencies of F2, a metallic voice perceived as louder was correlated with an increase in the amplitudes of F3 and F4
- Vocal tract adjustments such as velar lowering, pharyngeal wall narrowing, laryngeal raising, and aryepiglottic and lateral laryngeal constriction were frequently found

Histology of the vocal folds
Histology is the study of the minute structure, composition, and function of tissues. The layered histology of the vocal folds is what enables them to vibrate.

Histoanatomy of the Glottis

The glottis is defined as the true vocal folds and the space between them. It is composed of an intermembranous portion, or anterior glottis, and an intercartilaginous portion, or posterior glottis. The border between the anterior and posterior glottis is defined by an imaginary line drawn across the vocal fold at the tip of the vocal process of the arytenoid cartilage. The anterior glottis is the primary structure of vocal fold vibration for phonation, and the posterior glottis is the widest opening between the vocal folds for respiration. Thus, voice disorders often involve lesions of the anterior glottis. There are gradual changes in stiffness between the pliable vocal fold and the hard, hyaline cartilage of the arytenoid. The vocal processes of the arytenoid cartilages form a firm framework for the glottis but are made of elastic cartilage at the tip; the vocal process therefore bends at the elastic cartilage portion during adduction and abduction of the vocal folds.

Attachments of the Vocal Fold

The vibratory portion of the vocal fold in the anterior glottis is connected anteriorly to the thyroid cartilage by the macula flava and the anterior commissure tendon, or Broyle's ligament; posteriorly, this vibratory portion is connected to the vocal process of the arytenoid cartilage by the posterior macula flava. The macula flava in newborn vocal folds is important for the growth and development of the vocal ligament and the layered structure of the vocal folds. In the adult, the maculae flavae are probably required for metabolism of the extracellular matrices of the vocal fold mucosa, replacing damaged fibers in order to maintain the integrity and elasticity of the vocal fold tissues. Age-related changes in the macula flava influence the fibrous components of the vocal folds and are partially responsible for the differences in the acoustics of the adult and aged voice.

Layered Structure of the Adult Vocal Fold

The histological structure of the vocal fold can be separated into five or six tissues, depending on the source, which can then be grouped into three sections: the cover, the transition, and the body. The cover is composed of the epithelium (mucosa), the basal lamina (or basement membrane zone), and the superficial layer of the lamina propria. The transition is composed of the intermediate and deep layers of the lamina propria. The body is composed of the thyroarytenoid muscle. This layered structure of tissues is very important for vibration of the true vocal folds.

The Cover

Epithelium
The free edge of the vibratory portion of the vocal fold in the anterior glottis is covered with stratified squamous epithelium. This epithelium is five to twenty-five cells thick, with the most superficial layer consisting of one to three cells that are lost to abrasion of the vocal folds during the closed phase of vibration. Posteriorly, the glottis is covered with pseudostratified ciliated epithelium. On the surfaces of the epithelial cells are microridges and microvilli, which help to spread and retain a mucous coat on the epithelium. Lubrication of the vocal folds through adequate hydration is essential for normal phonation and for avoiding excessive abrasion. The epithelium has been described as a thin shell, the purpose of which is to maintain the shape of the vocal fold.

Basal Lamina or Basement Membrane Zone (BMZ)
This is transitional tissue composed of two zones, the lamina lucida and the lamina densa. The lamina lucida appears as a low-density clear zone medial to the epithelial basal cells. The lamina densa has a greater density of filaments and is adjacent to the lamina propria. The basal lamina, or BMZ, mainly provides physical support to the epithelium through anchoring fibers and is essential for repair of the epithelium.

Superficial Layer of the Lamina Propria
This layer consists of loose fibrous components and extracellular matrices that can be compared to soft gelatin. It is a structure that vibrates a great deal during phonation, and the viscoelasticity needed to support this vibratory function depends mostly on extracellular matrices. The primary extracellular matrices of the vocal fold cover are reticular, collagenous, and elastic fibers, as well as glycoproteins and glycosaminoglycans. These fibers serve as scaffolds for structural maintenance, providing tensile strength and resilience so that the vocal folds may vibrate freely but still retain their shape. This layer is also known as Reinke's space, but it is not a space at all: like the pleural cavity, it is a potential space, and if there really is a space, there is a problem. Surgery of the vocal folds can disturb this layer with scar tissue, which can result in the inability of the epithelium to retain an adequate mucous coat, which will in turn impact lubrication of the vocal folds.

The Transition: Intermediate and Deep Layers of the Lamina Propria
The intermediate layer of the lamina propria is primarily made up of elastic fibers, while the deep layer is primarily made up of collagenous fibers. These fibers run roughly parallel to the vocal fold edge, and these two layers of the lamina propria together comprise the vocal ligament. The transition layer is primarily structural, giving the vocal fold support as well as providing adhesion between the mucosa, or cover, and the body.

The Body: The Thyroarytenoid Muscle
This muscle is variously described as being divided into the thyroarytenoid and vocalis muscles, or into the thyrovocalis and the thyromuscularis, depending on the source.

Vocal Fold Lesions
The majority of vocal fold lesions arise in the cover of the folds. Since the basal lamina secures the epithelium to the superficial layer of the lamina propria with anchoring fibers, this is a common site for injury: with phonotrauma or habitual vocal hyperfunction (also known as pressed phonation), the proteins in the basal lamina can shear, causing vocal fold injury, usually seen as nodules or polyps, which increase the mass and thickness of the cover. The squamous cell epithelium of the anterior glottis is also a frequent site of laryngeal cancer caused by smoking.

Reinke's Edema
A voice pathology called Reinke's edema, a swelling due to abnormal accumulation of fluid, occurs in the superficial lamina propria, or Reinke's space. This causes the vocal fold mucosa to appear floppy, with excessive movement of the cover that has been described as looking like a loose sock. The greater mass of the vocal folds due to the increased fluid lowers the fundamental frequency (F0) during phonation.

Histological Changes From Birth to Old Age
The histologic structure of the vocal fold differs across the pediatric, adult, and old-age populations.

Pediatrics
The infant lamina propria is composed of only one layer, as compared to three in the adult, and there is no vocal ligament. The infant vocal fold is half membranous (anterior glottis) and half cartilaginous (posterior glottis), while the adult fold is approximately three-fifths membranous and two-fifths cartilaginous. The vocal ligament begins to be present in children at about four years of age. Two layers appear in the lamina propria between the ages of six and twelve, and the mature lamina propria, with its superficial, intermediate, and deep layers, is only present by the conclusion of adolescence. As vocal fold vibration is a foundation for vocal formants, this presence or absence of tissue layers produces a difference in the number of formants between the adult and pediatric populations: the mature voice has five to twelve formants, as opposed to the pediatric voice's three to six. The length of the vocal fold at birth is approximately six to eight millimeters, and it grows to its adult length of eight to sixteen millimeters by adolescence.

Puberty
Puberty usually lasts from two to five years and typically occurs between the ages of 12 and 17. During puberty, voice change is controlled by sex hormones. In women, the mature voice is three tones lower than the child's. In females during puberty, the vocal muscle thickens slightly but remains very supple and narrow, and the squamous mucosa differentiates into three distinct layers (the lamina propria) on the free edge of the vocal folds. The sub- and supraglottic glandular mucosa becomes hormone-dependent on estrogens and progesterone. The actions of estrogens and progesterone produce changes in the extravascular spaces by increasing capillary permeability, which allows the passage of intracapillary fluids to the interstitial space, as well as modification of glandular secretions. Estrogens have a hypertrophic and proliferative effect on the mucosa by reducing the desquamating effect on the superficial layers. Progesterone has an anti-proliferative effect on the mucosa and accelerates desquamation; it causes a menstrual-like cycle in the vocal fold epithelium and a drying out of the mucosa, with a reduction in secretions of the glandular epithelium. The thyroid hormones also affect the dynamic function of the vocal folds (Hashimoto's thyroiditis affects the fluid balance in the vocal folds).
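The inverse relation noted in this section between vocal fold length or mass and fundamental frequency (longer adult folds and the added fluid mass of Reinke's edema both lower F0) can be illustrated with the ideal-string formula f0 = (1/2L)·sqrt(T/μ). This is only a qualitative sketch: the vocal folds are layered viscoelastic tissue, not a uniform string, and the numbers below are illustrative placeholders, not physiological measurements.

```python
import math

def string_f0(length_m, tension_n, mass_per_length):
    """Fundamental frequency of an ideal vibrating string:
    f0 = (1 / (2 * L)) * sqrt(T / mu).
    Used here only to show the qualitative trend: longer or
    heavier folds vibrate at a lower fundamental frequency."""
    return math.sqrt(tension_n / mass_per_length) / (2 * length_m)

# Illustrative values only (not physiological measurements):
child = string_f0(0.007, 0.1, 0.001)  # ~7 mm fold, as at birth
adult = string_f0(0.016, 0.1, 0.001)  # ~16 mm fold, as in adulthood
print(child / adult)  # 16/7 ≈ 2.29: at equal tension, the longer fold is lower
```

Doubling the mass per unit length at fixed length likewise lowers f0 (by a factor of sqrt(2) in this model), which is the direction of change described for Reinke's edema.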

Androgens
Androgens are the most important hormones responsible for the passage from the boy's voice to the man's voice, and the change is irreversible. Testosterone, an androgen secreted by the testes, will cause changes in the cartilages and musculature of the larynx in males during puberty: the thyroid prominence appears, the vocal folds lengthen and become rounded, and the epithelium thickens with the formation of three distinct layers in the lamina propria. Among other things, this leads to the mature voice being better suited to the rigors of opera. In women, androgens are secreted principally by the adrenal cortex and the ovaries, and they can have irreversible masculinizing effects if present in high enough concentration. In muscles, androgens cause a hypertrophy of striated muscles, with a reduction in the fat cells in skeletal muscles and a reduction in whole-body fatty mass; they are essential to male sexuality. Progesterone, by contrast, has a diuretic effect and decreases capillary permeability, thus trapping the extracellular fluid out of the capillaries and causing tissue congestion.

Adulthood
There is a steady increase in the elastin content of the lamina propria as we age (elastin is a yellow scleroprotein, the essential constituent of elastic connective tissue), resulting in a decrease in the ability of the lamina propria to expand, caused by cross-branching of the elastin fibers.

Old Age
There is a thinning in the superficial layer of the lamina propria in old age, and in aging the vocal fold undergoes considerable sex-specific changes. The vocal fold cover thickens with aging, and the superficial layer of the lamina propria loses density as it becomes more edematous. The intermediate layer of the lamina propria tends to atrophy only in men, while the deep layer of the lamina propria of the male vocal fold thickens because of increased collagen deposits. The vocalis muscle atrophies in both men and women. However, the majority of elderly patients with voice disorders have disease processes associated with aging rather than physiologic aging alone.

Intelligibility (communication)

In phonetics, intelligibility is a measure of how comprehensible speech is, or the degree to which speech can be understood. It is affected by spoken clarity, explicitness, lucidity, comprehensibility, perspicuity, and precision.

Noise levels
For satisfactory communication, the average speech level should exceed that of an interfering noise by 6 dB; lower speech-to-noise ratios are rarely acceptable (Moore, 1997). Manifesting in a wide frequency range, speech is quite resistant to many types of masking and frequency cut-off: Moore reports, for example, that a band of frequencies from 1000 Hz to 2000 Hz is sufficient (a sentence articulation score of about 90%). Word articulation remains high even when only 1-2% of the wave is unaffected by distortion.

Quantity to be measured                  Unit of measurement   Good values
Articulation loss (popular in the USA)   %ALcons               < 10 %
Intelligibility (known internationally)  STI (RASTI)           > 0.6
Clarity index (widespread in Germany)    C50                   > 3 dB

Intelligibility with different types of speech

Lombard speech
The human brain automatically changes speech made in noise through a process called the Lombard effect. Such speech has increased intelligibility compared to normal speech: it is not only louder, but the frequencies of its phonetic fundamental are increased and the durations of its vowels are prolonged.
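The 6 dB guideline for speech over interfering noise can be checked numerically as a level difference between two signals. The sketch below computes relative RMS levels in dB for two synthetic tones; the function names and the toy signals are my own illustration, not part of any standard intelligibility toolkit.

```python
import math

def level_db(samples):
    """Relative RMS level of a signal in dB (no absolute reference)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def snr_ok(speech, noise, margin_db=6.0):
    """True if the speech level exceeds the noise level by >= margin_db,
    the rule of thumb for satisfactory communication cited above."""
    return level_db(speech) - level_db(noise) >= margin_db

# Toy signals: speech at twice the noise amplitude is ~6.02 dB above it,
# since 20 * log10(2) ≈ 6.02.
speech = [0.2 * math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise = [0.1 * math.sin(2 * math.pi * 50 * t / 8000) for t in range(8000)]
print(snr_ok(speech, noise))  # True
```

Because levels are logarithmic, every doubling of amplitude adds about 6 dB, so the guideline amounts to "speech at roughly twice the noise amplitude or more".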

When heard with noise, listeners hear speech recorded in noise better than speech recorded in quiet and then played with the same level of masking noise. Due to the Lombard effect, speakers adjust not only frequency but also the intensity and rate of pronouncing word syllables, and they tend to make more noticeable facial movements.

Screaming
Shouted speech is less intelligible than Lombard speech, because increased vocal energy produces decreased phonetic information.

Clear speech
Clear speech is used when talking to a person with a hearing impairment. It is characterized by a slower speaking rate, more and longer pauses, elevated speech intensity, increased word duration, "targeted" vowel formants, increased consonant intensity compared to adjacent vowels, and a number of phonological changes (including fewer reduced vowels and more released stop bursts).

Infant-directed speech
Infant-directed speech, or baby talk, uses a simplified syntax and a smaller, easier-to-understand vocabulary than speech directed to adults. Compared to adult-directed speech, it has a higher fundamental frequency, an exaggerated pitch range, and a slower rate.

Citation speech
Citation speech occurs when people engage self-consciously in spoken language research. It has a slower tempo and fewer connected speech processes (e.g., shortening of nuclear vowels, devoicing of word-final consonants) than normal speech.

Hyperspace speech
Hyperspace speech, also known as the hyperspace effect, occurs when people are misled about the presence of environmental noise. It involves modifying the F1 and F2 of phonetic vowel targets to ease perceived difficulties on the part of the listener in recovering information from the acoustic signal.

Lombard effect

The Lombard effect or Lombard reflex is the involuntary tendency of speakers to increase the intensity of their voice when speaking in loud noise, to enhance its audibility. This change includes not only loudness but also other acoustic features such as pitch and the rate and duration of sound syllables. The compensation results in an increase in the auditory signal-to-noise ratio of the speaker's spoken words. The effect was discovered in 1909 by Étienne Lombard, a French otolaryngologist. It links to the needs of effective communication: the effect is reduced when words are repeated or lists are read, where communication intelligibility is not important. Since the effect is also involuntary, it is used as a means to detect malingering in those simulating hearing loss. Research upon great tits and beluga whales that live in environments with noise pollution finds that the effect also occurs in the vocalizations of nonhuman animals; great tits sing at a higher frequency in noise-polluted urban surroundings than in quieter ones, to help overcome the auditory masking that would otherwise impair other birds hearing their song.
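As a rough illustration of the compensation just described, Lombard gain can be modeled as a piecewise-linear rise in vocal intensity once ambient noise passes a threshold. The 45 dB threshold and the 0.5 dB-per-dB slope below are illustrative placeholders chosen for the sketch, not measured constants from the Lombard-effect literature.

```python
def lombard_level_db(noise_db, base_db=60.0, slope=0.5, threshold_db=45.0):
    """Toy model of the Lombard effect: below a noise threshold the
    speaker talks at a comfortable base level; above it, vocal
    intensity rises roughly linearly with the noise level.
    All parameter values are illustrative assumptions."""
    if noise_db <= threshold_db:
        return base_db
    return base_db + slope * (noise_db - threshold_db)

print(lombard_level_db(40.0))  # quiet room: stays at the 60.0 dB base level
print(lombard_level_db(65.0))  # noisy street: 70.0, raised by 10 dB
```

A fuller model would also raise fundamental frequency and vowel duration, since the effect changes more than loudness alone.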

Changes between normal and Lombard speech include:

- an increase in phonetic fundamental frequencies
- a shift in energy from low frequency bands to middle or high bands
- an increase in sound intensity
- an increase in vowel duration
- spectral tilting
- a shift in formant center frequencies for F1 (mainly) and F2
- the durations of content words are prolonged to a greater degree in noise than those of function words
- greater lung volumes are used
- it is accompanied by larger facial movements, though these do not aid intelligibility as much as its sound changes

These changes cannot be controlled by instructing a person to speak as they would in silence, though people can learn control with feedback. The Lombard effect has been found to be greatest upon those words that are important for the listener to understand, suggesting that such cognitive effects are important.

Mechanisms
The intelligibility of an individual's own vocalization can be adjusted with audio-vocal reflexes using their own hearing (the private loop), or it can be adjusted indirectly in terms of how well listeners can hear the vocalization (the public loop). Both processes are involved in the Lombard effect.

Private loop
A speaker can regulate their vocalizations, particularly their amplitude relative to background noise, through reflexive auditory feedback. Such auditory feedback is known to maintain the production of vocalization, since deafness affects the vocal acoustics of both humans and songbirds, and changing the auditory feedback also changes vocalization in human speech or bird song. Neural circuits have been found in the brainstem that enable such reflex adjustment.

Public loop
A speaker can regulate their vocalizations at a higher cognitive level by observing the consequences on their audience's ability to hear them. In this form of auditory self-monitoring, vocalizations are adjusted in terms of learnt associations of what features of vocalization, when made in noise, create effective and efficient communication.

Development
Both private and public loop processes exist in children. There is a developmental shift, however, from the Lombard effect being linked to acoustic self-monitoring in young children to the adjustment of vocalizations to aid intelligibility for others in adults.

Neurology
The Lombard effect depends upon audio-vocal neurons in the periolivary region of the superior olivary complex and the adjacent pontine reticular formation. It has been suggested that the Lombard effect might also involve the higher cortical areas that control these lower brainstem areas.

Choral singing
Choral singers experience reduced feedback due to the sound of the other singers upon their own voice. This results in a tendency for people in choruses to sing at a louder level if it is not controlled by a conductor. Trained soloists can control this effect, but it has been suggested that after a concert they might speak more loudly in noisy surroundings, as at after-concert parties. The Lombard effect also occurs following laryngectomy, when people who have had speech therapy talk with esophageal speech, and in those playing instruments such as the guitar.

Animal vocalization
Noise has been found to affect the vocalizations of animals that vocalize against a background of human noise pollution. Great tits in Leiden sing with a higher frequency than do those in quieter areas, to overcome the masking effect of the low-frequency background noise pollution of cities.

Beluga whales in the St. Lawrence River estuary adjust their whale song so that it can be heard against shipping noise. Experimentally, the Lombard effect has also been found in the vocalizations of:

- Budgerigars
- Cats
- Chickens
- Common marmosets
- Cottontop tamarins
- Japanese quail
- Nightingales
- Rhesus macaques
- Squirrel monkeys
- Zebra finches

Manner of articulation

Human vocal tract

In linguistics (articulatory phonetics), manner of articulation describes how the tongue, lips, jaw, and other speech organs are involved in making a sound. Often the concept is only used for the production of consonants. For any place of articulation, there may be several manners of articulation, and therefore several homorganic consonants.

Stricture

One parameter of manner is stricture, that is, how closely the speech organs approach one another. From greatest to least stricture, speech sounds may be classified along a cline as stop consonants (with occlusion, or blocked airflow), fricative consonants (with partially blocked and therefore strongly turbulent airflow), approximants (with only slight turbulence), and vowels (with full unimpeded airflow). Affricates often behave as if they were intermediate between stops and fricatives, but phonetically they are sequences of stop plus fricative. Historically, sounds may move along this cline toward less stricture in a process called lenition. The reverse process is fortition.

Other parameters

Sibilants are distinguished from other fricatives by the shape of the tongue and how the airflow is directed over the teeth. Fricatives at coronal places of articulation may be sibilant or non-sibilant, sibilants being the more common. Parameters other than stricture include those involved in the r-like sounds (taps and trills) and the sibilancy of fricatives. Often nasality and laterality are included in manner, but phoneticians such as Peter Ladefoged consider them to be independent.
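The stricture cline and the notion of lenition can be made concrete as a small lookup table: each lenition step moves a sound one position toward less stricture, and fortition moves it the other way. This is a deliberately simplified sketch; real lenition trajectories are language-specific and need not follow this exact ladder.

```python
# The stricture cline from the text, ordered from greatest to least stricture.
STRICTURE_CLINE = ["stop", "fricative", "approximant", "vowel"]

def lenite(manner):
    """One step of lenition: movement toward less stricture on the cline."""
    i = STRICTURE_CLINE.index(manner)
    return STRICTURE_CLINE[min(i + 1, len(STRICTURE_CLINE) - 1)]

def fortify(manner):
    """The reverse process, fortition: one step toward greater stricture."""
    i = STRICTURE_CLINE.index(manner)
    return STRICTURE_CLINE[max(i - 1, 0)]

print(lenite("stop"))        # fricative (e.g. a [k] weakening toward [x])
print(fortify("fricative"))  # stop
```

Affricates do not fit neatly on this ladder, consistent with the text's remark that they behave as intermediate between stops and fricatives.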

Nasal airflow may be added as an independent parameter to any speech sound. It is most commonly found in nasal stops and nasal vowels. When a sound is not nasal, it is called oral. An oral stop is often called a plosive, while a nasal stop is generally just called a nasal. However, nasal fricatives, taps, and approximants are also found.

Laterality is the release of airflow at the side of the tongue. This can be combined with other manners, resulting in lateral approximants (the most common), lateral flaps, and lateral fricatives and affricates.

Individual manners

- Plosive, or oral stop, where there is complete occlusion (blockage) of both the oral and nasal cavities of the vocal tract, and therefore no air flow. If it is voiceless, a plosive is completely silent: what we hear as a /p/ or /k/ is the effect that the onset of the occlusion has on the preceding vowel, as well as the release burst and its effect on the following vowel. If the consonant is voiced, the voicing is the only sound made during occlusion. The shape and position of the tongue (the place of articulation) determine the resonant cavity that gives different plosives their characteristic sounds. All languages have plosives. Examples include English /p t k/ (voiceless) and /b d g/ (voiced).
- Nasal stop, usually shortened to nasal, where there is complete occlusion of the oral cavity and the air passes instead through the nose. The shape and position of the tongue determine the resonant cavity that gives different nasal stops their characteristic sounds. Nearly all languages have nasals, the only exceptions being in the area of Puget Sound and a single language on Bougainville Island. Examples include English /m, n/.
- Affricate, which begins like a plosive but releases into a fricative rather than having a separate release of its own. The English letters "ch" and "j" represent affricates. Affricates are quite common around the world, though less common than fricatives.
- Fricative, sometimes called spirant, where there is continuous frication (turbulent and noisy airflow) at the place of articulation. Examples include English /f, s/ (voiceless) and /v, z/ (voiced). Most languages have fricatives, though many have only an /s/. The Indigenous Australian languages, however, are almost completely devoid of fricatives of any kind.
- Sibilants are a type of fricative where the airflow is guided by a groove in the tongue toward the teeth, creating a high-pitched and very distinctive sound. These are by far the most common fricatives. Fricatives at coronal (front of tongue) places of articulation are usually, though not always, sibilant. English sibilants include /s/ and /z/.
- Lateral fricatives are a rare type of fricative, where the frication occurs on one or both sides of the edge of the tongue. The "ll" of Welsh and the "hl" of Zulu are lateral fricatives.
- Trill, in which the articulator (usually the tip of the tongue) is held in place and the airstream causes it to vibrate. The double "r" of Spanish "perro" is a trill. Since trilling is a separate parameter from stricture, the two may be combined: increasing the stricture of a typical trill results in a trilled fricative. Trilled affricates are also known.
- Flap, often called a tap, is a momentary closure of the oral cavity. Taps and flaps are similar to very brief stops. Many linguists distinguish taps from flaps, but there is no consensus on what the difference might be, and no language relies on such a difference. The "tt" of "utter" and the "dd" of "udder" are pronounced as a flap in North American English. There are also lateral flaps.
- Approximant, where there is very little obstruction. Examples include English /w/ and /r/. In some languages, such as Spanish, there are sounds which seem to fall between fricative and approximant.

Trills and flaps, where there are one or more brief occlusions, constitute a class of consonant called rhotics; their articulation and behavior are distinct enough to be considered a separate manner, rather than just a matter of length.

- Lateral approximants, usually shortened to lateral, are a type of approximant pronounced with the side of the tongue. English /l/ is a lateral. Together with the rhotics, which have similar behavior in many languages, these form a class of consonant called liquids.

One use of the word semivowel, sometimes called a glide, is for a type of approximant pronounced like a vowel but with the tongue closer to the roof of the mouth, so that there is slight turbulence. In English, /w/ is the semivowel equivalent of the vowel /u/, and /j/ (spelled "y") is the semivowel equivalent of the vowel /i/ in this usage. Other descriptions use semivowel for vowel-like sounds that are not syllabic but do not have the increased stricture of approximants; these are found as elements in diphthongs. The word may also be used to cover both concepts.

Other airstream initiations

All of the manners of articulation above are pronounced with an airstream mechanism called pulmonic egressive, meaning that the air flows outward and is powered by the lungs (actually the ribs and diaphragm). Other airstream mechanisms are possible. Sounds that rely on some of these include:

- Ejectives, which are glottalic egressive. Here the airstream is powered by an upward movement of the glottis rather than by the lungs or diaphragm. Plosives, affricates, and occasionally fricatives may occur as ejectives. All ejectives are voiceless.
- Implosives, which are glottalic ingressive. Here the glottis moves downward, but the lungs may be used simultaneously (to provide voicing), and in some languages no air may actually flow into the mouth. Implosive oral stops are not uncommon, but implosive affricates and fricatives are rare. Voiceless implosives are also rare.
- Clicks, which are velaric ingressive. Here the back of the tongue is used to create a vacuum in the mouth, causing air to rush in when the forward occlusion (tongue or lips) is released. Clicks may be oral or nasal, stop or affricate, central or lateral, voiced or voiceless. They are extremely rare in normal words outside Southern Africa. English has a click in its "tsk tsk" (or "tut tut") sound, and another is often used to say "giddy up" to a horse.

Broader classifications

Manners of articulation with substantial obstruction of the airflow (plosives, fricatives, affricates) are called obstruents. These are prototypically voiceless, but voiced obstruents are extremely common as well. Manners without such obstruction (nasals, liquids, approximants, and also vowels) are called sonorants, because they are nearly always voiced. Voiceless sonorants are uncommon, but are found in Welsh and Classical Greek (the spelling "rh"), in Tibetan (the "lh" of Lhasa), and in the "wh" of those dialects of English which distinguish "which" from "witch". Sonorants may also be called resonants, and some linguists prefer that term, restricting the word "sonorant" to non-vocoid resonants (that is, nasals and liquids, but not vowels or semivowels). Another common distinction is between stops (plosives and nasals) and continuants (all else); affricates are considered to be both, because they are sequences of stop plus fricative.

Paralanguage

Paralanguage refers to the non-verbal elements of communication used to modify meaning and convey emotion. Paralanguage may be expressed consciously or unconsciously, and it includes the pitch, volume and, in some cases, intonation of speech. Sometimes the definition is restricted to vocally produced sounds. The study of paralanguage is known as paralinguistics. The term "paralanguage" is sometimes used as a cover term for body language, which is not necessarily tied to speech, and for paralinguistic phenomena in speech. The latter are phenomena that can be observed in speech (Saussure's parole) but that do not belong to the arbitrary conventional code of language (Saussure's langue).

One can distinguish the following aspects of speech signals and perceived utterances:

Perspectival aspects

Speech signals that arrive at a listener's ears have acoustic properties that may allow listeners to localize the speaker (distance, direction). Sound localization functions in a similar way also for non-speech sounds. The perspectival aspects of lip reading are more obvious and have more drastic effects when head turning is involved.

Organic aspects

The speech organs of different speakers differ in size. As children grow up, their organs of speech become larger, and there are differences between male and female adults. The differences concern not only size but also proportions. They affect the pitch of the voice and, to a substantial extent, also the formant frequencies, which characterize the different speech sounds. The organic quality of speech has a communicative function in a restricted sense, since it is merely informative about the speaker: it will be expressed independently of the speaker's intention.

Expressive aspects

The properties of the voice and the way of speaking are affected by emotions and attitudes. Typically, attitudes are expressed intentionally and emotions without intention, but attempts to fake or to hide emotions are not unusual. Expressive variation is central to paralanguage. It affects loudness, speaking rate, pitch, pitch range and, to some extent, also the formant frequencies.

Linguistic aspects

These aspects are the main concern of linguists. Ordinary phonetic transcriptions of utterances reflect only the linguistically informative quality. The problem of how listeners factor out the linguistically informative quality from speech signals is a topic of current research.

Some of the linguistic features of speech, in particular of its prosody, are paralinguistic or pre-linguistic in origin. A most fundamental and widespread phenomenon of this kind is known as the "frequency code" (Ohala, 1984). This code works even in communication across species. It has its origin in the fact that the acoustic frequencies in the voice of small vocalizers are high, while they are low in the voice of large vocalizers. This gives rise to secondary meanings such as 'harmless', 'submissive', and 'unassertive', which are naturally associated with smallness, while meanings such as 'dangerous', 'dominant', and 'assertive' are associated with largeness. In most languages, the frequency code also serves the purpose of distinguishing questions from statements. It is universally reflected in expressive variation, and it is reasonable to assume that it has phylogenetically given rise to the sexual dimorphism that lies behind the large difference in pitch between average female and male adults.

The paralinguistic properties of speech play an important role in human speech communication. There are no utterances or speech signals that lack paralinguistic properties, since speech requires the presence of a voice that can be modulated. This voice must have some properties, and all the properties of a voice as such are paralinguistic. However, the distinction between linguistic and paralinguistic applies not only to speech but to writing and sign language as well, and it is not bound to any sensory modality. Even vocal language has some paralinguistic as well as linguistic properties that can be seen (lip reading, the McGurk effect) and even felt, e.g. by the Tadoma method.

In text-only communication such as email, chatrooms and instant messaging, paralinguistic elements can be displayed by emoticons, font and color choices, capitalization, and the use of non-alphabetic or abstract characters. Nonetheless, paralanguage in written communication is limited in comparison with face-to-face conversation, sometimes leading to misunderstandings.

Nonverbal communication

Nonverbal communication (NVC) is usually understood as the process of communication through sending and receiving wordless messages. Language is not the only source of communication; there are other means also. NVC can be communicated through gestures and touch (haptic communication), by body language or posture, and by facial expression and eye contact. It can also be communicated through object communication such as clothing, hairstyles or even architecture, as well as through symbols and infographics. Speech contains nonverbal elements known as paralanguage, including voice quality, emotion and speaking style, as well as prosodic features such as rhythm, intonation and stress. Dance is also regarded as a form of nonverbal communication. Likewise, written texts have nonverbal elements such as handwriting style, spatial arrangement of words, or the use of emoticons. Nonverbal communication can occur through any sensory channel: sight, sound, smell, touch or taste.

NVC is important because: "When we speak (or listen), our attention is focused on words rather than body language. But our judgment includes both. An audience is simultaneously processing both verbal and nonverbal cues. Body movements are not usually positive or negative in and of themselves; rather, the situation and the message will determine the appraisal." (Givens, 2000, p. 4)

History

The first scientific study of nonverbal communication was Charles Darwin's book The Expression of the Emotions in Man and Animals (1872). He argued that all mammals show emotion reliably in their faces. Paul Ekman's influential 1960s studies of facial expression determined that expressions of anger, disgust, fear, joy, sadness and surprise are universal. Studies now range across a number of fields, including linguistics, semiotics and social psychology.

Arbitrariness

While much nonverbal communication is based on arbitrary symbols, which differ from culture to culture, a large proportion is also to some extent iconic and may be universally understood.

Verbal vs. oral communication

Scholars in this field usually use a strict sense of the term "verbal", meaning "of or concerned with words", and do not use "verbal communication" as a synonym for oral or spoken communication. Thus, vocal sounds that are not considered to be words, such as a grunt, or singing a wordless note, are nonverbal. Sign languages and writing are generally understood as forms of verbal communication, as both make use of words, although like speech, both may contain paralinguistic elements and often occur alongside nonverbal messages. Much of the study of nonverbal communication has focused on face-to-face interaction, where it can be classified into three principal areas: environmental conditions where communication takes place, the physical characteristics of the communicators, and the behaviors of communicators during interaction.

Clothing and bodily characteristics

Uniforms have both a functional and a communicative purpose. This man's clothes identify him as male and a police officer; his badges and shoulder sleeve insignia give information about his job and rank. Elements such as physique, height, weight, hair, skin color, gender, odors, and clothing send nonverbal messages during interaction. For example, a study carried out in Vienna, Austria, of the clothing worn by women attending discothèques showed that in certain groups of women (especially women who were in town without their partners), motivation for sex and levels of sexual hormones were correlated with aspects of the clothing, especially the amount of skin displayed and the presence of sheer clothing. Thus, to some degree, clothing sent signals about interest in courtship.

Research into height has generally found that taller people are perceived as being more impressive. Melamed & Bozionelos (1992) studied a sample of managers in the UK and found that height was a key factor affecting who was promoted. Often people try to make themselves taller, for example by standing on a platform, when they want to make more of an impact with their speaking.

Physical environment

Environmental factors such as furniture, architectural style, interior decorating, lighting conditions, colors, temperature, noise, and music affect the behavior of communicators during interaction. The furniture itself can be seen as a nonverbal message.

Proxemics: physical space in communication

Proxemics is the study of how people use and perceive the physical space around them. The space between the sender and the receiver of a message influences the way the message is interpreted. The perception and use of space varies significantly across cultures and different settings within cultures. Space in nonverbal communication may be divided into four main categories: intimate, personal, social, and public space.

The term territoriality is still used in the study of proxemics to explain human behavior regarding personal space. Hargie & Dickson (2004, p. 69) identify four such territories:

1. Primary territory: this refers to an area that is associated with someone who has exclusive use of it, for example a house that others cannot enter without the owner's permission.
2. Secondary territory: unlike the previous type, there is no "right" to occupancy, but people may still feel some degree of ownership of a particular space. For example, someone may sit in the same seat on the train every day and feel aggrieved if someone else sits there.
3. Public territory: this refers to an area that is available to all, but only for a set period, such as a parking space or a seat in a library. Although people have only a limited claim over that space, they often exceed that claim. For example, it was found that people take longer to leave a parking space when someone is waiting to take that space.
4. Interaction territory: this is space created by others when they are interacting. For example, when a group is talking to each other on a footpath, others will walk around the group rather than disturb it.

Chronemics: time in communication

Chronemics is the study of the use of time in nonverbal communication. The way we perceive time, structure our time and react to time is a powerful communication tool, and helps set the stage for communication. Time perceptions include punctuality and willingness to wait, the speed of speech and how long people are willing to listen. The timing and frequency of an action, as well as the tempo and rhythm of communications within an interaction, contribute to the interpretation of nonverbal messages. Gudykunst & Ting-Toomey (1988) identified two dominant time patterns: monochronic time and polychronic time.

Monochronic time

A monochronic time system means that things are done one at a time and time is segmented into precise, small units. Under this system time is scheduled, arranged and managed. The United States is considered a monochronic society. This perception of time is learned and rooted in the Industrial Revolution, where "factory life required the labor force to be on hand and in place at an appointed hour" (Guerrero, DeVito & Hecht, 1999, p. 238). For Americans, time is a precious resource not to be wasted or taken lightly. We buy time, save time, spend time and make time. Our time can be broken down into years, months, days, hours, minutes, seconds and even milliseconds. We use time to structure both our daily lives and events that we are planning for the future. We have schedules that we must follow: appointments that we must go to at a certain time, classes that start and end at certain times, work schedules that start and end at certain times, and even our favorite TV shows that start and end at a certain time.

As communication scholar Edward T. Hall wrote regarding the American's viewpoint of time in the business world, "the schedule is sacred." Hall says that for monochronic cultures, "time is tangible" and viewed as a commodity where "time is money" or "time is wasted." The result of this perspective is that Americans and other monochronic cultures, such as the German and Swiss, place a paramount value on schedules, tasks and "getting the job done." These cultures are committed to regimented schedules and may view those who do not subscribe to the same perception of time as disrespectful. Monochronic cultures include Germany, Canada, Switzerland, the United States, and Scandinavia.

Polychronic time

A polychronic time system is a system in which several things can be done at once, and a more fluid approach is taken to scheduling time. Unlike Americans and most northern and western European cultures, Latin American and Arabic cultures use the polychronic system of time. These cultures are much less focused on the preciseness of accounting for each and every moment. As Raymond Cohen notes, polychronic cultures are deeply steeped in tradition rather than in tasks, a clear difference from their monochronic counterparts. Cohen notes that "Traditional societies have all the time in the world. The arbitrary divisions of the clock face have little saliency in cultures grounded in the cycle of the seasons, the invariant pattern of rural life, and the calendar of religious festivities" (Cohen, 1997, p. 34). Instead, their culture is more focused on relationships, rather than watching the clock. They have no problem being "late" for an event if they are with family or friends, because the relationship is what really matters. As a result, polychronic cultures have a much less formal perception of time. They are not ruled by precise calendars and schedules; rather, "cultures that use the polychronic time system often schedule multiple appointments simultaneously, so keeping on schedule is an impossibility." Polychronic cultures include Saudi Arabia, Egypt, Mexico, the Philippines, India, and many in Africa.

Movement and body position

Kinesics

Information about the relationship and affect of these two skaters is communicated by their body posture, eye gaze and physical contact.

The term kinesics was first used (in 1952) by Ray Birdwhistell, an anthropologist who wished to study how people communicate through posture, gesture, stance, and movement. Part of Birdwhistell's work involved making film of people in social situations and analyzing them to show different levels of communication not clearly seen otherwise. The study was joined by several other anthropologists, including Margaret Mead and Gregory Bateson.

Posture

Posture can be used to determine a participant's degree of attention or involvement, the difference in status between communicators, and the level of fondness a person has for the other communicator. Studies investigating the impact of posture on interpersonal relationships suggest that mirror-image congruent postures, where one person's left side is parallel to the other person's right side, lead to favorable perception of communicators and positive speech; a person who displays a forward lean or a decrease in a backwards lean also signals positive sentiment during communication. Posture is understood through such indicators as direction of lean, body orientation, arm position, and body openness.

Gesture

A wink is a type of gesture. A gesture is a non-vocal bodily movement intended to express meaning. Gestures may be articulated with the hands, arms or body, and also include movements of the head, face and eyes, such as winking, nodding, or rolling one's eyes. The boundary between language and gesture, or verbal and nonverbal communication, can be hard to identify.

Although the study of gesture is still in its infancy, some broad categories of gestures have been identified by researchers. Gestures can be categorized as either speech-independent or speech-related.

Speech-independent gestures are dependent upon culturally accepted interpretation and have a direct verbal translation. A wave hello or a peace sign are examples of speech-independent gestures. The most familiar are the so-called emblems or quotable gestures: conventional, culture-specific gestures that can be used as replacements for words, such as the hand-wave used in the US for "hello" and "goodbye". A single emblematic gesture can have a very different significance in different cultural contexts, ranging from complimentary to highly offensive. For a list of emblematic gestures, see list of gestures. Gestural languages such as American Sign Language and its regional siblings operate as complete natural languages that are gestural in modality. They should not be confused with finger spelling, in which a set of emblematic gestures are used to represent a written alphabet. Gestures such as mudra (Sanskrit) encode sophisticated information accessible to initiates who are privy to the subtlety of elements encoded in their tradition.

Speech-related gestures are used in parallel with verbal speech; this form of nonverbal communication is used to emphasize the message that is being communicated. Speech-related gestures are intended to provide supplemental information to a verbal message, such as pointing to an object of discussion, and are integrally connected to speech and thought processes. Another broad category comprises those gestures used spontaneously when we speak. These gestures are closely coordinated with speech. The so-called beat gestures are used in conjunction with speech and keep time with the rhythm of speech to emphasize certain words or phrases. Other spontaneous gestures used when we speak are more contentful and may echo or elaborate the meaning of the co-occurring speech. For example, a gesture that depicts the act of throwing may be synchronous with the utterance, "He threw the ball right into the window."

Haptics: touching in communication

A high five is an example of communicative touch. Haptics is the study of touching as nonverbal communication. Touches that can be defined as communication include handshakes, holding hands, kissing (cheek, lips, hand), back slapping, high fives, a pat on the shoulder, and brushing an arm. Touching of oneself may include licking, picking, holding, and scratching. These behaviors are referred to as "adapters" or "tells" and may send messages that reveal the intentions or feelings of a communicator. The meaning conveyed from touch is highly dependent upon the context of the situation, the relationship between communicators, and the manner of touch.

Humans communicate interpersonal closeness through a series of non-verbal actions known as immediacy behaviors. Examples of immediacy behaviors are smiling, touching, open body positions, and eye contact. Cultures that display these immediacy behaviors are known to be high-contact cultures.

Haptic communication is the means by which people and other animals communicate via touching. Touch is an extremely important sense for humans; as well as providing information about surfaces and textures, it is a component of nonverbal communication in interpersonal relationships, and vital in conveying physical intimacy. It can be both sexual (such as kissing) and platonic (such as hugging or tickling). Touch is the earliest sense to develop in the fetus, and the development of an infant's haptic senses, and how it relates to the development of the other senses such as vision, has been the target of much research. Human babies have been observed to have enormous difficulty surviving if they do not possess a sense of touch, even if they retain sight and hearing, while babies who can perceive through touch, even without sight and hearing, tend to fare much better. Touch can be thought of as a basic sense in that most life forms have a response to being touched, while only a subset have sight and hearing. In chimpanzees the sense of touch is highly developed; as newborns they see and hear poorly but cling strongly to their mothers. Harry Harlow conducted a controversial study involving rhesus monkeys and observed that monkeys reared with a "terry cloth mother", a wire feeding apparatus wrapped in softer terry cloth which provided a level of tactile stimulation and comfort, were considerably more emotionally stable as adults than those with a mere wire mother (Harlow, 1958).

Touching is treated differently from one country to another, and socially acceptable levels of touching vary from one culture to another. In the Thai culture, touching someone's head may be thought rude. Remland and Jones (1995) studied groups of people communicating and found that in England (8%), France (5%) and the Netherlands (4%) touching was rare compared to their Italian (14%) and Greek (12.5%) samples. Striking, pushing, pulling, pinching, kicking, strangling and hand-to-hand fighting are forms of touch in the context of physical abuse. In a study conducted by the Touch Research Institutes at the University of Miami School of Medicine, American children were said to be more aggressive than their French counterparts while playing at a playground; it was noted that French women touched their children more. Stoeltje (2003) wrote about how Americans are 'losing touch' with this important communication skill.

The word touch has many metaphorical uses. One can be emotionally touched, referring to an action or object that evokes an emotional response; to say "I was touched by your letter" implies the reader felt a strong emotion when reading it. In a sentence like "I never touched him/her" or "Don't you dare touch him/her", the term touch may be meant as a euphemism for either physical abuse or sexual touching. To 'touch oneself' is a euphemism for masturbation.

Eye gaze

The study of the role of eyes in nonverbal communication is sometimes referred to as "oculesics". Eye contact can indicate interest, attention, and involvement. Gaze comprises the actions of looking while talking, looking while listening, amount of gaze, frequency of glances, patterns of fixation, pupil dilation, and blink rate.

Paralanguage: nonverbal cues of the voice

Paralanguage (sometimes called vocalics) is the study of nonverbal cues of the voice. Various acoustic properties of speech such as tone, pitch and accent, collectively known as prosody, can all give off nonverbal cues. Paralanguage may change the meaning of words. The linguist George L. Trager developed a classification system which consists of the voice set, voice qualities, and vocalization.

The voice set is the context in which the speaker is speaking. This can include the situation, gender, mood, age and a person's culture. The voice qualities are volume, pitch, tempo, rhythm, articulation, resonance, nasality, and accent. They give each individual a unique "voice print". Vocalization consists of three subsections: characterizers, qualifiers and segregates. Characterizers are emotions expressed while speaking, such as laughing, crying, and yawning. A voice qualifier is the style of delivering a message, for example yelling "Hey stop that!" as opposed to whispering "Hey stop that". Vocal segregates such as "uh-huh" notify the speaker that the listener is listening.

Functions of nonverbal communication

Argyle (1970) put forward the hypothesis that whereas spoken language is normally used for communicating information about events external to the speakers, non-verbal codes are used to establish and maintain interpersonal relationships. It is considered more polite or nicer to communicate attitudes towards others non-verbally rather than verbally, for instance in order to avoid embarrassing situations. Argyle (1988) concluded there are five primary functions of nonverbal bodily behavior in human communication:

- Express emotions
- Express interpersonal attitudes
- To accompany speech in managing the cues of interaction between speakers and listeners
- Self-presentation of one's personality
- Rituals (greetings)

Concealing deception

Nonverbal communication makes it easier to lie without being revealed. This is the conclusion of a study where people watched made-up interviews of persons accused of having stolen a wallet. The interviewees lied in about 50% of the cases. People had access to either written transcripts of the interviews, or audio tape recordings, or video recordings. The more clues that were available to those watching, the larger was the trend that interviewees who actually lied were judged to be truthful. That is, people who are clever at lying can use voice tone and facial expression to give the impression that they are truthful.

The relation between verbal and nonverbal communication

The relative importance of verbal and nonverbal communication

An interesting question is: when two people are communicating face-to-face, how much of the meaning is communicated verbally, and how much is communicated non-verbally? This was investigated by Albert Mehrabian and reported in two papers. The latter paper concluded: "It is suggested that the combined effect of simultaneous verbal, vocal, and facial attitude communications is a weighted sum of their independent effects, with coefficients of .07, .38, and .55, respectively." That is, clues from spoken words, from the voice tone, and from the facial expression were found to contribute 7%, 38%, and 55%, respectively, to the total meaning. This "rule" is widely cited and is presented on all types of popular courses with statements like "scientists have found out that ...". In reality, however, it is extremely weakly founded. First, it is based on the judgment of the meaning of single tape-recorded words, i.e. a very artificial context. Second, the figures are obtained by combining results from two different studies which potentially cannot be combined. Third, it relates only to the communication of positive versus negative emotions. Fourth, it relates only to women, as men did not participate in the study.

Since then, other studies have analysed the relative contribution of verbal and nonverbal signals under more naturalistic situations. Argyle, using video tapes shown to the subjects, analysed the communication of submissive/dominant attitude and found that non-verbal cues had 4.3 times the effect of verbal cues. The most important effect was that body posture communicated superior status in a very efficient way.
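The weighted-sum model quoted above can be made concrete with a few lines of code. The coefficients (.07, .38, .55) come from the cited Mehrabian papers; the input "attitude scores" and their -1 to +1 scale are invented here purely for illustration, not part of the original studies:

```python
# Illustrative sketch of Mehrabian's weighted-sum model of attitude
# communication. Weights are from the cited papers; the example input
# scores (arbitrary -1..+1 scale) are made up for this sketch.

VERBAL, VOCAL, FACIAL = 0.07, 0.38, 0.55  # the weights sum to 1.0

def combined_attitude(verbal: float, vocal: float, facial: float) -> float:
    """Weighted sum of the three independent channel judgments."""
    return VERBAL * verbal + VOCAL * vocal + FACIAL * facial

# Positive words delivered with a negative tone and facial expression:
# the nonverbal channels dominate, so the combined judgment is negative.
score = combined_attitude(verbal=0.8, vocal=-0.5, facial=-0.5)
print(round(score, 3))  # -0.409
```

The point of the sketch is only that, under this model, the nonverbal channels outweigh the words; the criticisms in the text about how narrowly the coefficients were obtained still apply.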

On the other hand, a study by Hsee et al. had subjects judge a person on the dimension happy/sad and found that words spoken with minimal variation in intonation had an impact about four times larger than facial expressions seen in a film without sound. Thus, the relative importance of spoken words and facial expressions may be very different in studies using different set-ups.

Interaction of verbal and nonverbal communication

When communicating, nonverbal messages can interact with verbal messages in six ways: repeating, conflicting, complementing, substituting, regulating and accenting/moderating.

Repeating

"Repeating" consists of using gestures to strengthen a verbal message, such as pointing to the object of discussion.

Conflicting

Verbal and nonverbal messages within the same interaction can sometimes send opposing or conflicting messages. A person verbally expressing a statement of truth while simultaneously fidgeting or avoiding eye contact may convey a mixed message to the receiver in the interaction. Conflicting messages may occur for a variety of reasons, often stemming from feelings of uncertainty, ambivalence, or frustration.[19] When mixed messages occur, nonverbal communication becomes the primary tool people use to attain additional information to clarify the situation; great attention is placed on bodily movements and positioning when people perceive mixed messages during interactions.[20]

Complementing

Accurate interpretation of messages is made easier when nonverbal and verbal communication complement each other. Nonverbal cues can be used to elaborate on verbal messages to reinforce the information sent when trying to achieve communicative goals; messages have been shown to be remembered better when nonverbal signals affirm the verbal exchange.[21]

Substituting

Nonverbal behavior is sometimes used as the sole channel for communication of a message. People learn to identify facial expressions, body movements, and body positioning as corresponding with specific feelings and intentions, so nonverbal signals can be used without verbal communication to convey messages. When nonverbal behavior does not effectively communicate a message, verbal methods are used to enhance understanding.[21]

Regulating

Nonverbal behavior also regulates our conversations. For example, touching someone's arm can signal that you want to talk next or interrupt.[22]

Accenting/Moderating

Nonverbal signals are used to alter the interpretation of verbal messages. Touch, voice pitch, and gestures are some of the tools people use to accent or amplify the message that is sent; nonverbal behavior can also be used to moderate or tone down aspects of verbal messages. For example, a person who is verbally expressing anger may accent the verbal message by shaking a fist.[23]

Dance and nonverbal communication

Dance is a form of nonverbal communication that requires the same underlying faculty in the brain for conceptualization, creativity and memory as does verbal language in speaking and writing. As means of self-expression, both forms have vocabulary (steps and gestures in dance), grammar (rules for putting the vocabulary together) and meaning. Dance, however, assembles (choreographs) these elements in a manner that more often resembles poetry, with its ambiguity and multiple, symbolic and elusive meanings.

Clinical studies of nonverbal communication

From 1977 to 2004, the influence of disease and drugs on receptivity of nonverbal communication was studied by teams at three separate medical schools using a similar paradigm. Researchers at the University of Pittsburgh, Yale University and Ohio State University had subjects observe gamblers at a slot machine awaiting payoffs.

The amount of this payoff was read by nonverbal transmission prior to reinforcement. This technique was developed by, and the studies directed by, psychologist Dr. Robert E. Miller and psychiatrist Dr. A. James Giannini. These groups reported diminished receptive ability in heroin addicts[24] and phencyclidine abusers,[25] contrasted with increased receptivity in cocaine addicts. Men with major depression[26] manifested significantly decreased ability to read nonverbal cues when compared with euthymic men; in contradistinction, men with bipolar disorder possessed increased abilities.[27] Obese women[28] and women with premenstrual syndrome[29] were found to also possess diminished abilities to read these cues. Because certain drugs enhanced ability while others diminished it, the neurotransmitters dopamine and endorphin were considered to be likely etiological candidates. Based on the available data, the members of the research team hypothesized a biochemical site in the brain which was operative for reception of nonverbal cues.[30] A woman with total paralysis of the nerves of facial expression was found unable to transmit any nonverbal facial cues whatsoever.[31] Because of the changes in levels of accuracy on the levels of nonverbal receptivity, the primary cause and primary effect could not be sorted out on the basis of the paradigm employed.[32] Freitas-Magalhaes studied the effect of the smile in the treatment of depression and concluded that depressive states decrease when you smile more often.[33]

A byproduct of the work of the Pittsburgh/Yale/Ohio State team was an investigation of the role of nonverbal facial cues in heterosexual nondate rape. Males who were serial rapists of adult women were studied for nonverbal receptive abilities; their scores were the highest of any subgroup.[34] Rape victims were next tested. It was reported that women who had been raped on at least two occasions by different perpetrators had a highly significant impairment in their abilities to read these cues in either male or female senders.[35] These results were troubling, indicating a predator-prey model. The authors did note that whatever the nature of these preliminary findings, the responsibility of the rapist was in no manner or level diminished.

The final target of study for this group was the medical students they taught. Medical students at Ohio State University, Ohio University and Northeast Ohio Medical College were invited to serve as subjects. Students indicating a preference for the specialties of family practice, psychiatry, pediatrics and obstetrics-gynecology achieved significantly higher levels of accuracy than those students who planned to train as surgeons, radiologists, or pathologists. Internal medicine and plastic surgery candidates scored at levels near the mean.

Difficulties with nonverbal communication

People vary in their ability to send and receive nonverbal communication. On average, women are better at nonverbal communication than are men.[36][37][38][39] Measurements of the ability to communicate nonverbally and the capacity to feel empathy have shown that the two abilities are independent of each other.[40] For people who have relatively large difficulties with nonverbal communication, this can pose significant challenges, especially in interpersonal relationships. A specific group of persons that face these challenges are those with autism spectrum disorders, including Asperger syndrome. There exist resources that are tailored specifically to these people, which attempt to assist those in understanding information which comes more easily to others.

Phonation

Glottal states, from open to closed:
- Voiceless (full airstream)
- Breathy voice (murmur)
- Slack voice
- Modal voice (maximum vibration)
- Stiff voice
- Creaky voice (restricted airstream)
- Glottalized (blocked airstream)

Supra-glottal phonation:
- Faucalized voice ("hollow")
- Harsh voice ("pressed")
- Strident (harsh trilled)

Non-phonemic phonation:
- Whisper
- Falsetto

Phonation has slightly different meanings depending on the subfield of phonetics. Among some phoneticians, phonation is the process by which the vocal folds produce certain sounds through quasi-periodic vibration. This is the definition used among those who study laryngeal anatomy and physiology and speech production in general. Other phoneticians, though, call this process voicing, and use the term phonation to refer to any oscillatory state of any part of the larynx that modifies the airstream, of which voicing is just one example. As such, voiceless and supra-glottal phonation are included under this definition, which is common in the field of linguistic phonetics.

Voicing

The phonatory process, or voicing, occurs when air is expelled from the lungs through the glottis, creating a pressure drop across the larynx. When this drop becomes sufficiently large, the vocal folds start to oscillate. The minimum pressure drop required to achieve phonation is called the phonation threshold pressure; for humans with normal vocal folds, it is approximately 2–3 cm H2O. The motion of the vocal folds during oscillation is mostly in the lateral direction, though there is also some superior component as well. However, there is almost no motion along the length of the vocal folds. The oscillation of the vocal folds serves to modulate the pressure and flow of the air through the larynx, and this modulated airflow is the main component of the sound of most voiced phones.

The sound that the larynx produces is a harmonic series. In other words, it consists of a fundamental tone (called the fundamental frequency, the main acoustic cue for the percept pitch) accompanied by harmonic overtones, which are multiples of the fundamental frequency. According to the Source-Filter Theory, the resulting sound excites the resonance chamber that is the vocal tract to produce the individual speech sounds.

The vocal folds will not oscillate if they are not sufficiently close to one another, if they are not under sufficient tension or are under too much tension, or if the pressure drop across the larynx is not sufficiently large. In linguistics, a phone is called voiceless if there is no phonation during its occurrence. In speech, voiceless phones are associated with vocal folds that are elongated, highly tensed, and placed laterally (abducted) when compared to vocal folds during phonation.

Fundamental frequency can be varied through a variety of means. Large-scale changes are accomplished by increasing the tension in the vocal folds through contraction of the cricothyroid muscle. Smaller changes in tension can be effected by contraction of the thyroarytenoid muscle or by changes in the relative position of the thyroid and cricoid cartilages, as may occur when the larynx is lowered or raised, either volitionally or through movement of the tongue, to which the larynx is attached via the hyoid bone. In addition to tension changes, fundamental frequency is also affected by the pressure drop across the larynx, which is mostly determined by the pressure in the lungs, and it will also vary with the distance between the vocal folds. Variation in fundamental frequency is used linguistically to produce intonation and tone.

There are currently two main theories as to how vibration of the vocal folds is initiated: the myoelastic theory and the aerodynamic theory. These two theories are not in contention with one another, and it is quite possible that both are true and operating simultaneously to initiate and maintain vibration. A third theory, the neurochronaxic theory, was in considerable vogue in the 1950s but has since been largely discredited.
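The pressure-driven open/close cycle and the harmonic series described above can be caricatured in a few lines of code. This is a toy illustration, not a physiological model; all constants are assumptions chosen for readability.

```python
# Toy sketch: subglottal pressure builds against the closed folds until
# it exceeds a threshold set by fold tension, the folds are pushed
# apart and vent the pressure, and the cycle repeats. The number of
# cycles per unit time plays the role of the fundamental frequency.

def cycle_count(build_rate, threshold, steps):
    """Count open/close cycles over `steps` time steps (integer units
    sidestep floating-point drift in this toy)."""
    pressure, cycles = 0, 0
    for _ in range(steps):
        pressure += build_rate        # lungs push against closed folds
        if pressure >= threshold:     # folds pushed apart: one cycle
            cycles += 1
            pressure = 0              # airflow vents the pressure drop
    return cycles

def harmonic_series(f0, n):
    """The harmonic series: integer multiples of the fundamental."""
    return [f0 * k for k in range(1, n + 1)]

# With 10_000 steps standing in for one second: pressure rises 1 unit
# per step against a threshold of 100, giving 100 cycles "per second".
f0 = cycle_count(build_rate=1, threshold=100, steps=10_000)
print(f0)                            # 100
print(cycle_count(1, 200, 10_000))   # 50: stiffer "folds", lower pitch
print(harmonic_series(f0, 4))        # [100, 200, 300, 400]
```

Note how the two knobs mirror the prose: raising the threshold (fold tension) lowers the cycle rate, while raising the build rate (lung pressure) raises it, and the overtones are simply integer multiples of whatever fundamental results.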

Myoelastic and aerodynamic theory

The myoelastic theory states that when the vocal cords are brought together and breath pressure is applied to them, the cords remain closed until the pressure beneath them (the subglottic pressure) is sufficient to push them apart, allowing air to escape and reducing the pressure enough for the muscle tension recoil to pull the folds back together again. Pressure builds up once again until the cords are pushed apart, and the whole cycle keeps repeating itself. The rate at which the cords open and close (the number of cycles per second) determines the pitch of the phonation.

The aerodynamic theory is based on the Bernoulli energy law in fluids. The theory states that when a stream of breath is flowing through the glottis while the arytenoid cartilages are held together by the action of the interarytenoid muscles, a push-pull effect is created on the vocal fold tissues that maintains self-sustained oscillation. The push occurs during glottal opening, when the glottis is convergent, whereas the pull occurs during glottal closing, when the glottis is divergent. During glottal closure, the air flow is cut off until breath pressure pushes the folds apart and the flow starts up again, causing the cycles to repeat. The textbook Myoelastic Aerodynamic Theory of Phonation (Titze, I.R., ed., 2006) credits Janwillem van den Berg as the originator of the theory and provides detailed mathematical development of it.

Neurochronaxic theory

This theory states that the frequency of vocal fold vibration is determined by the chronaxy of the recurrent nerve, and not by breath pressure or muscular tension. Advocates of this theory thought that every single vibration of the vocal folds was due to an impulse from the recurrent laryngeal nerves and that the acoustic center in the brain regulated the speed of vocal fold vibration. Speech and voice scientists have long since left this theory, as the muscles have been shown to not be able to contract fast enough to accomplish the vibration. In addition, persons with paralyzed vocal folds can produce phonation, which would not be possible according to this theory. Phonation occurring in excised larynges would also not be possible according to this theory.

In linguistic phonetic treatments of phonation, such as those of Peter Ladefoged, phonation was considered to be a matter of points on a continuum of tension and closure of the vocal cords. More intricate mechanisms were occasionally described, but they were difficult to investigate, and until recently the state of the glottis and phonation were considered to be nearly synonymous.

If the vocal cords are completely relaxed, with the arytenoid cartilages apart for maximum airflow, the cords do not vibrate. This is voiceless phonation, and is extremely common with obstruents. If the arytenoids are pressed together for glottal closure, the vocal cords block the airstream, producing stop sounds such as the glottal stop. In between there is a sweet spot of maximum vibration. This is modal voice, and is the normal state for vowels and sonorants in all the world's languages. The aperture of the arytenoid cartilages, and therefore the tension in the vocal cords, is one of degree between the end points of open and closed, and there are several intermediate situations utilized by various languages to make contrasting sounds.

[Figure: A continuum from open glottis to closed. The black triangles represent the arytenoid cartilages, the sail shapes the vocal folds, and the dotted circle the windpipe.]

For example, Gujarati has vowels with a partially lax phonation called breathy voice or murmured, while Burmese has vowels with a partially tense phonation called creaky voice or laryngealized. Both of these phonations have dedicated IPA diacritics, an under-umlaut and an under-tilde.
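The dedicated diacritics just mentioned are Unicode combining characters, so breathy and creaky symbols can be composed programmatically. A minimal sketch; the code-point mapping follows standard IPA usage, and the helper name is my own:

```python
# Composing IPA phonation diacritics with Unicode combining characters:
# breathy voice takes an under-umlaut (combining diaeresis below) and
# creaky voice an under-tilde (combining tilde below).

DIACRITICS = {
    "breathy": "\u0324",  # combining diaeresis below, e.g. [a̤]
    "creaky":  "\u0330",  # combining tilde below,     e.g. [a̰]
}

def phonate(base, phonation):
    """Attach the diacritic for `phonation` to a base IPA letter."""
    return base + DIACRITICS[phonation]

print(phonate("a", "breathy"))  # a̤  (a murmured vowel, as in Gujarati)
print(phonate("a", "creaky"))   # a̰  (a laryngealized vowel, as in Burmese)
```

Because the marks are combining characters, the two code points render as a single glyph in any Unicode-aware display.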

Although each language may be somewhat different, it is convenient to classify these degrees of phonation into discrete categories. A series of seven alveolar plosives, with phonations ranging from an open/lax to a closed/tense glottis, are:

- Open glottis: [t] voiceless (full airstream)
- [d̤] breathy voice
- [d̥] slack voice
- Sweet spot: [d] modal voice (maximum vibration)
- [d̬] stiff voice
- [d̰] creaky voice
- Closed glottis: [ʔt] glottal closure (blocked airstream)

The IPA diacritics under-ring and subscript wedge, commonly called "voiceless" and "voiced", are sometimes added to the symbol for a voiced sound to indicate more lax/open (slack) and tense/closed (stiff) states of the glottis, respectively. (Ironically, adding the 'voicing' diacritic to the symbol for a voiced consonant indicates less modal voicing, not more, because a modally voiced sound is already fully voiced, at its sweet spot; any further tension in the vocal cords dampens their vibration.)

Javanese does not have modal voice in its plosives, but contrasts two other points along the phonation scale, with more moderate departures from modal voice, called slack voice and stiff voice. The "muddy" consonants in Shanghainese are slack voice; they contrast with tenuis and aspirated consonants. Alsatian, like several Germanic languages, has a typologically unusual phonation in its stops. The consonants transcribed /b̥/, /d̥/, /ɡ̊/ (ambiguously called "lenis") are partially voiced: the vocal cords are positioned as for voicing, but do not actually vibrate. That is, they are technically voiceless, but without the open glottis usually associated with voiceless stops. They contrast with both modally voiced /b, d, ɡ/ and modally voiceless /p, t, k/ in French borrowings, as well as with aspirated /kʰ/ word-initially.

The Jalapa dialect of Mazatec is unusual in contrasting both breathy and creaky voice with modal voice in a three-way distinction. (Note that Mazatec is a tonal language, so the glottis is making several tonal distinctions simultaneously with the phonation distinctions.)

Mazatec:
- breathy voice [ja̤] 'he wears'
- modal voice [já] 'tree'
- creaky voice [ja̰] 'he carries'

(Note: there was an editing error in the source of this information; the latter two translations may have been mixed up.)

Glottal consonants

It has long been noted that in many languages, both phonologically and historically, the glottal consonants [ʔ, ɦ, h] do not behave like other consonants. Phonetically, they have no manner or place of articulation other than the state of the glottis: glottal closure for [ʔ], breathy voice for [ɦ], and open airstream for [h]. Some phoneticians have described these sounds as neither glottal nor consonantal, but instead as instances of pure phonation, at least in many European languages. However, in Semitic languages they do appear to be true glottal consonants.

Supra-glottal phonation

In the last few decades it has become apparent that phonation may involve the entire larynx, with as many as six valves and muscles working either independently or together. From the glottis upward, these articulations are:

1. glottal (the vocal cords), producing the distinctions described above
2. ventricular (the 'false vocal cords', partially covering and damping the glottis)
3. arytenoid (sphincteric compression forwards and upwards)
4. epiglotto-pharyngeal (retraction of the tongue and epiglottis, potentially closing onto the pharyngeal wall)
5. raising or lowering of the entire larynx
6. narrowing of the pharynx

Until the development of fiber-optic laryngoscopy, the full involvement of the larynx during speech production was not observable, and the interactions among the six laryngeal articulators are still poorly understood. However, at least two supra-glottal phonations appear to be widespread in the world's languages. These are harsh voice ('ventricular' or 'pressed' voice), which involves overall constriction of the larynx, and faucalized voice ('hollow' or 'yawny' voice), which involves overall expansion of the larynx.

The Bor dialect of Dinka has contrastive modal, breathy, faucalized, and harsh voice in its vowels, as well as three tones. The ad hoc diacritics employed in the literature are a subscript double quotation mark for faucalized voice and underlining for harsh voice. Examples (the phonation diacritics were lost in the source):

- modal tɕìt 'diarrhea'
- breathy tɕ̤ìt 'go ahead'
- harsh (underlined) tɕìt 'scorpions'
- faucalized (subscript double quote) tɕìt 'to swallow'

Other languages with these contrasts are Bai (modal, breathy, and harsh voice), Kabiye (faucalized and harsh voice, previously seen as ±ATR), and Somali (breathy and harsh voice).[citation needed]

Elements of laryngeal articulation or phonation may occur widely in the world's languages as phonetic detail even when not phonemically contrastive. For example, simultaneous glottal, ventricular, and arytenoid activity (for something other than epiglottal consonants) has been observed in Tibetan, Korean, Nuuchahnulth, Nlaka'pamux, Thai, Sui, Amis, Pame, Arabic, Tigrinya, Cantonese, and Yi.

Familiar language examples

In languages such as French, all obstruents occur in pairs, one modally voiced and one voiceless. In English, every voiced fricative corresponds to a voiceless one. For the pairs of English plosives, however, the distinction is better specified as voice onset time rather than simply voice: in initial position /b d g/ are only partially voiced (voicing begins during the hold of the consonant), while /p t k/ are aspirated (voicing doesn't begin until well after its release).[citation needed]

In languages without the distinction between voiceless and voiced obstruents, it is often found that they are realized as voiced in voiced environments, such as between vowels, and voiceless elsewhere. A few European languages, such as Finnish, have no phonemically voiced obstruents but pairs of long and short consonants instead. Outside of Europe, a lack of voicing distinctions is not uncommon; indeed, in Australian languages it is nearly universal.

Certain English morphemes have voiced and voiceless allomorphs, such as the plural, verbal, and possessive endings spelled -s (voiced in kids /kɪdz/ but voiceless in kits /kɪts/) and the past-tense ending spelled -ed (voiced in buzzed /bʌzd/ but voiceless in fished /fɪʃt/).[citation needed]
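The voicing assimilation behind these allomorphs can be sketched as a small rule. The phoneme set below is a simplified assumption for illustration only; sibilant-final stems, which take the /ɪz/ allomorph, are deliberately left out.

```python
# Choosing the allomorph of the English plural ending spelled -s by the
# voicing of the stem-final sound: voiceless /s/ after a voiceless
# final, voiced /z/ after a voiced one.

VOICELESS_FINALS = {"p", "t", "k", "f", "θ"}  # simplified, non-sibilant set

def plural_allomorph(final_sound):
    """Return 's' or 'z' for the written -s plural ending."""
    return "s" if final_sound in VOICELESS_FINALS else "z"

print(plural_allomorph("t"))  # s, as in kits /kɪts/
print(plural_allomorph("d"))  # z, as in kids /kɪdz/
print(plural_allomorph("n"))  # z, as in pens /pɛnz/
```

The same rule, with "d" substituted for "z", models the -ed past-tense alternation in buzzed versus fished.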

Vocal registers

In phonology

In phonology, a register is a combination of tone and vowel phonation into a single phonological parameter. For example, among its vowels, Burmese combines modal voice with low tone, breathy voice with falling tone, creaky voice with high tone, and glottal closure with high tone, but no other combination of phonation (modal, breath, creak, closed) and tone (high, low, falling) is found.

In pedagogy and speech pathology

Among vocal pedagogues and speech pathologists, a vocal register also refers to a particular phonation limited to a particular range of pitch, which possesses a characteristic sound quality. The term "register" may be used for several distinct aspects of the human voice:

- A particular part of the vocal range, such as the upper, middle, or lower registers, which may be bounded by vocal breaks
- A particular phonation
- A resonance area such as chest voice or head voice
- A certain vocal timbre

Four combinations of these elements are identified in speech pathology: the vocal fry register, the modal register, the falsetto register, and the whistle register. These four registers contrast with each other.

Phonetics

Phonetics (from the Greek φωνή, phōnē, "sound, voice") is a branch of linguistics that comprises the study of the sounds of human speech. It is concerned with the physical properties of speech sounds (phones): their physiological production, acoustic properties, auditory perception, and neurophysiological status.

History

Phonetics was studied as early as 2500 years ago in ancient India, with Pāṇini's account of the place and manner of articulation of consonants in his 5th century BC treatise on Sanskrit. The major Indic alphabets today order their consonants according to Pāṇini's classification. The Ancient Greeks are credited as the first to base a writing system on a phonetic alphabet. Modern phonetics began with Alexander Melville Bell, whose Visible Speech (1867) introduced a system of precise notation for writing down speech sounds.

The study of phonetics was strongly enhanced in the late 19th century, partly owing to the invention of the phonograph, which allowed the speech signal to be recorded and then later processed and analyzed. By replaying the same speech signal from the phonograph several times, filtering it each time with a different band-pass filter, a spectrogram of the speech utterance could be built up. A series of papers by Ludimar Hermann published in Pflüger's Archiv in the last two decades of the 19th century investigated the spectral properties of vowels and consonants using the Edison phonograph, and it was in these papers that the term formant was first introduced. Hermann also played back vowel recordings made with the Edison phonograph at different speeds to distinguish between Willis' and Wheatstone's theories of vowel production.
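Hermann's procedure of replaying one recording through a series of band-pass filters amounts, in modern terms, to measuring the signal's energy band by band. A minimal sketch with a pure-Python DFT probe standing in for each filter; the 100 Hz test tone and sample rate are illustrative assumptions:

```python
import math

def band_energy(signal, sample_rate, freq):
    """Magnitude of the DFT of `signal` at `freq` Hz: roughly the energy
    one narrow band-pass filter centred on `freq` would let through."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    return math.hypot(re, im) / n

rate = 1000                                            # samples per second
tone = [math.sin(2 * math.pi * 100 * i / rate) for i in range(rate)]

# "Replay" the same recording once per band, as Hermann replayed the
# phonograph once per filter:
bands = {f: band_energy(tone, rate, f) for f in (50, 100, 150, 200)}
print(max(bands, key=bands.get))                       # 100: the tone's band dominates
```

Repeating this band-by-band measurement over successive time windows, rather than one whole recording, is exactly what builds up a modern spectrogram.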

Subfields

Phonetics as a research discipline has three main branches:

- articulatory phonetics is concerned with the articulation of speech: the position, shape, and movement of articulators or speech organs, such as the lips, tongue, and vocal folds.
- acoustic phonetics is concerned with the acoustics of speech: the spectro-temporal properties of the sound waves produced by speech, such as their frequency, amplitude, and harmonic structure.
- auditory phonetics is concerned with speech perception: the perception, categorization, and recognition of speech sounds and the role of the auditory system and the brain in the same.

Transcription

Main article: Phonetic transcription

Phonetic transcription is a system for transcribing sounds that occur in spoken language or signed language. The most widely known system of phonetic transcription, the International Phonetic Alphabet (IPA), uses a one-to-one mapping between phones and written symbols. The standardized nature of the IPA enables its users to transcribe accurately and consistently the phones of different languages, dialects, and idiolects. The IPA is a useful tool not only for the study of phonetics, but also for language teaching, professional acting, and speech pathology.

Relation to phonology

In contrast to phonetics, phonology is the study of how sounds and gestures pattern in and across languages, relating such concerns with other levels and aspects of language. Phonetics deals with the articulatory and acoustic properties of speech sounds, how they are produced, and how they are perceived. As part of this investigation, phoneticians may concern themselves with the physical properties of meaningful sound contrasts or the social meaning encoded in the speech signal (e.g. gender, sexuality, ethnicity, etc.). However, a substantial portion of research in phonetics is not concerned with the meaningful elements in the speech signal.

While it is widely agreed that phonology is grounded in phonetics, phonology is a distinct branch of linguistics, concerned with sounds and gestures as abstract units (e.g. features, phonemes, mora, syllables, etc.) and their conditioned variation (via, e.g., allophonic rules, constraints, or derivational rules). Phonology, in other words, is concerned with the abstract, grammatical characterization of systems of sounds. It relates to phonetics via the set of distinctive features, which map the abstract representations of speech units to articulatory gestures, acoustic signals, and/or perceptual representations.

Applications

Applications of phonetics include:

- forensic phonetics: the use of phonetics (the science of speech) for forensic (legal) purposes
- speech recognition: the analysis and transcription of recorded speech by a computer system

Puberty

Puberty is the process of physical changes by which a child's body becomes an adult body capable of reproduction. Puberty is initiated by hormone signals from the brain to the gonads (the ovaries and testes). In response, the gonads produce a variety of hormones that stimulate the growth, function, or transformation of brain, bones, muscle, blood, skin, hair, breasts, and reproductive organs. Growth accelerates in the first half of puberty and stops at the completion of puberty. Before puberty, body differences between boys and girls are almost entirely restricted to the genitalia. During puberty, major differences of size, shape, composition, and function develop in many body structures and systems. The most obvious of these are referred to as secondary sex characteristics.

In a strict sense, the term puberty (derived from the Latin word puberatum, "age of maturity, manhood") refers to the bodily changes of sexual maturation rather than the psychosocial and cultural aspects of adolescent development. Adolescence is the period of psychological and social transition between childhood and adulthood. Adolescence largely overlaps the period of puberty, but its boundaries are less precisely defined, and it refers as much to the psychosocial and cultural characteristics of development during the teen years as to the physical changes of puberty.

Differences between male and female puberty

Two of the most significant differences between puberty in girls and puberty in boys are the age at which it begins and the major sex steroids involved. Although there is a wide range of normal ages, girls typically begin the process of puberty at age 10, boys at age 12. Girls usually complete puberty by ages 15–17, while boys usually complete puberty by ages 16–18. Any increase in height beyond these ages is uncommon. Girls attain reproductive maturity about 4 years after the first physical changes of puberty appear. In contrast, boys accelerate more slowly but continue to grow for about 6 years after the first visible pubertal changes.

Although boys are on average 2 cm shorter than girls before puberty begins, adult men are on average about 13 cm (5.2 inches) taller than women. Most of this sex difference in adult heights is attributable to a later onset of the growth spurt and a slower progression to completion, a direct result of the later rise and lower adult male levels of estradiol. The male "growth spurt" also begins later, accelerates more slowly, and lasts longer before the epiphyses fuse.

For boys, an androgen called testosterone is the principal sex hormone. While testosterone produces all boys' changes characterized as virilization, a substantial product of testosterone metabolism in males is estradiol, though levels rise later and more slowly than in girls.

[Figure: Approximate outline of development periods in child and teenager development; puberty is marked in green at right. Legend: 1 Follicle-stimulating hormone (FSH), 2 Luteinizing hormone (LH), 3 Progesterone, 4 Estrogen, 5 Hypothalamus, 6 Pituitary gland, 7 Ovary, 8 Pregnancy (hCG, human chorionic gonadotropin), 9 Testosterone, 10 Testicle, 11 Incentives, 12 Prolactin (PRL)]

The hormone that dominates female development is an estrogen called estradiol. Estradiol levels rise earlier and reach higher levels in women than in men. While estradiol promotes growth of breasts and uterus, it is also the principal hormone driving the pubertal growth spurt and epiphyseal maturation and closure.

Puberty onset

The onset of puberty is associated with high GnRH pulsing, which precedes the rise in sex hormones, LH and FSH. Exogenous GnRH pulses cause the onset of puberty, and brain tumors which increase GnRH output may also lead to premature puberty. The cause of the GnRH rise is unknown. Leptin might be the cause: leptin has receptors in the hypothalamus, which synthesizes GnRH; individuals who are deficient in leptin fail to initiate puberty; and the levels of leptin increase with the onset of puberty and then decline to adult levels when puberty is completed. The rise in GnRH might also be caused by genetics. A study discovered that a mutation in genes encoding both Neurokinin B as well as the Neurokinin B receptor can alter the timing of puberty. The researchers hypothesized that Neurokinin B might play a role in regulating the secretion of Kisspeptin, a compound responsible for triggering direct release of GnRH as well as indirect release of LH and FSH.

Physical changes in boys

Testicular size, function, and fertility

In boys, testicular enlargement is the first physical manifestation of puberty (and is termed gonadarche). Testes in prepubertal boys change little in size from about 1 year of age to the onset of puberty, averaging about 2–3 cm in length and about 1.5–2 cm in width. Testicular size continues to increase throughout puberty, reaching maximal adult size about 6 years after the onset of puberty. While 18–20 cc is an average adult size, there is wide variation in testicular size in the normal population. The testes have two primary functions: to produce hormones and to produce sperm. Most of the increasing bulk of testicular tissue is spermatogenic tissue (primarily Sertoli and Leydig cells). The Leydig cells produce testosterone, which in turn produces most of the male pubertal changes. During puberty, a male's scrotum will become larger and begin to dangle or hang below the body as opposed to being up tight, to accommodate the production of sperm; the testicles need a certain temperature to be fertile.

Sperm can be detected in the morning urine of most boys after the first year of pubertal changes, and occasionally earlier[citation needed]. On average, potential fertility in boys is reached at 13 years old, but full fertility will not be gained until 14–16 years of age[citation needed].

After the boy's testicles have enlarged and developed for about one year, the length and then the breadth of the shaft of the penis will increase, and the glans penis and corpora cavernosa will also start to enlarge to adult proportions.

Pubic hair

Pubic hair often appears on a boy shortly after the genitalia begin to grow. The pubic hairs are usually first visible at the dorsal (abdominal) base of the penis. The first few hairs are described as stage 2. Stage 3 is usually reached within another 6–12 months, when the hairs are too many to count. By stage 4, the pubic hairs densely fill the "pubic triangle." Stage 5 refers to the spread of pubic hair to the thighs and upward towards the navel as part of the developing abdominal hair.

Body and facial hair

[Image: Facial hair of a male that has been shaved]

In the months and years following the appearance of pubic hair, other areas of skin that respond to androgens may develop androgenic hair. The usual sequence is: underarm (axillary) hair, perianal hair, upper lip hair, sideburn (preauricular) hair, periareolar hair, and the beard area.[19] As with most human biological processes, this specific order may vary among some individuals. Arm, leg, chest, abdominal, and back hair become heavier more gradually. There is a large range in amount of body hair among adult men, and significant differences in timing and quantity of hair growth among different racial groups. Facial hair is often present in late adolescence, but may not appear until significantly later. Facial hair will continue to get coarser, darker, and thicker for another 2–4 years after puberty.[20] Some men do not develop full facial hair for up to 10 years after the completion of puberty. Chest hair may appear during puberty or years after, though not all men have chest hair.

Voice change

Under the influence of androgens, the voice box, or larynx, grows in both sexes. Before puberty, the larynx of boys and girls is about equally small. This growth is far more prominent in boys, causing the male voice to drop and deepen, sometimes abruptly but rarely "over night," about one octave, because the longer and thicker vocal folds have a lower fundamental frequency. Most of the voice change happens during stage 3–4 of male puberty, around the time of peak growth. Full adult pitch is attained at an average age of 15 years, and it usually precedes the development of significant facial hair by several months to years. Occasionally, voice change is accompanied by unsteadiness of vocalization in the early stages of untrained voices.

Male musculature and body shape

By the end of puberty, adult men have heavier bones and nearly twice as much skeletal muscle. Some of the bone growth (e.g. shoulder width and jaw) is disproportionately greater, resulting in noticeably different male and female skeletal shapes. The average adult male has about 150% of the lean body mass of an average female, and about 50% of the body fat. This muscle develops mainly during the later stages of puberty, and muscle growth can continue even after boys are biologically adult. The peak of the so-called "strength spurt", the rate of muscle growth, is attained about one year after a male experiences his peak growth rate.

Often, the fat pads of the male breast tissue and the male nipples will develop during puberty; sometimes, especially in one breast, this becomes more apparent and is termed gynecomastia. It is usually not a permanent phenomenon and typically fully diminishes at the end of puberty.

Body odor and acne

Rising levels of androgens can change the fatty acid composition of perspiration, resulting in a more "adult" body odor. Another androgen effect is increased secretion of oil (sebum) from the skin and the resultant variable amounts of acne. Acne cannot be prevented or diminished easily, and it is not unusual for a fully grown adult to suffer the occasional bout of acne, though it is normally less severe than in adolescents. Because acne is emotionally difficult and can cause scarring, some may desire using prescription topical creams or ointments, or even oral medication, to keep acne from getting worse.

Physical changes in girls

Breast development

The first physical sign of puberty in girls is usually a firm, tender lump under the center of the areola of one or both breasts, occurring on average at about 10.5 years of age.[21] This is referred to as thelarche. By the widely used Tanner staging of puberty, this is stage 2 of breast development (stage 1 is a flat, prepubertal breast). Within six to 12 months, the swelling has clearly begun in both sides, softened, and can be felt and seen extending beyond the edges of the areolae. This is stage 3 of breast development.[22] By another 12 months (stage 4), the breasts are approaching mature size and shape, with areolae and papillae forming a secondary mound. In most young women, this mound disappears into the contour of the mature breast (stage 5), although there is so much variation in sizes and shapes of adult breasts that stages 4 and 5 are not always separately identifiable.[22][31]

Pubic hair

Pubic hair is often the second noticeable change in puberty, usually within a few months of thelarche.[23] It is referred to as pubarche. The pubic hairs are usually visible first along the labia. The first few hairs are described as Tanner stage 2.[23] Stage 3 is usually reached within another 6–12 months, when the hairs are too numerous to count.[23] By stage 4, the pubic hairs densely fill the "pubic triangle." Stage 5 refers to spread of pubic hair to the thighs and sometimes as abdominal hair upward towards the navel.[24] In about 15% of girls, the earliest pubic hair appears before breast development begins.[23]

Vagina, uterus, ovaries

The mucosal surface of the vagina also changes in response to increasing levels of estrogen, becoming thicker and duller pink in color (in contrast to the brighter red of the prepubertal vaginal mucosa).[25] Whitish secretions (physiologic leukorrhea) are a normal effect of estrogen as well. In the two years following thelarche, the uterus and the follicles in the ovaries increase in size. The ovaries usually contain small follicular cysts visible by ultrasound.

Menstruation and fertility

The first menstrual bleeding is referred to as menarche, and typically occurs about two years after thelarche.[23] The average age of menarche in girls is 11.75 years.[23] The time between menstrual periods (menses) is not always regular in the first two years after menarche.[28] Ovulation is necessary for fertility, but may or may not accompany the earliest menses.[28] In postmenarchal girls, about 80% of the cycles were anovulatory in the first year after menarche, 50% in the third year and 10% in the sixth year.[29] Initiation of ovulation after menarche is not inevitable: a high proportion of girls with continued irregularity in the menstrual cycle several years from menarche will continue to have prolonged irregularity and anovulation, and are at higher risk for reduced fertility.[30] Nubility is used to designate achievement of fertility.

Body shape, fat distribution, and body composition

During this period, also in response to rising levels of estrogen, the lower half of the pelvis and thus hips widen (providing a larger birth canal).[22] Fat tissue increases to a greater percentage of the body composition than in males, especially in the typical female distribution of breasts, hips, buttocks, thighs, upper arms, and pubis. On average, at 10 years, girls have 6% more body fat than boys.[32] Progressive differences in fat distribution as well as sex differences in local skeletal growth contribute to the typical female body shape by the end of puberty.

Body odor and acne

Rising levels of androgens can change the fatty acid composition of perspiration, resulting in a more "adult" body odor. Another androgen effect is increased secretion of oil (sebum) from the skin. This change increases the susceptibility to acne, a skin condition that is characteristic of puberty.[33] Acne varies greatly in its severity.[33]

Timing of the onset of puberty

The definition of the onset of puberty depends on perspective (e.g. hormonal versus physical) and purpose (establishing population normal standards, clinical care of early or late pubescent individuals, etc.). The most commonly used definition of the onset of puberty is physical changes to a person's body.[citation needed] These physical changes are the first visible signs of neural, hormonal, and gonadal function changes.

The age at which puberty begins varies between individuals; usually, puberty begins between 10 and 13 years of age. The age at which puberty begins is affected by both genetic factors and by environmental factors such as nutritional state and social circumstances.[34] An example of social circumstances is the Vandenbergh effect: a juvenile female who has significant interaction with adult males will enter puberty earlier than juvenile females who are not socially overexposed to adult males.[35]

The average age at which puberty begins may be affected by race as well. The earliest average onset of puberty is for African-American girls, and the latest average onset is for high-altitude subsistence populations in Asia. However, much of the higher age averages reflect nutritional limitations more than genetic differences, and can change within a few generations with a substantial change in diet. The median age of menarche for a population may be an index of the proportion of undernourished girls in the population, and the width of the spread may reflect unevenness of wealth and food distribution in a population.

Researchers have identified an earlier age of the onset of puberty. However, they have based their conclusions on a comparison of data from 1999 with data from 1969. In the earlier example, the sample population was based on a small sample of white girls (200, from Britain). The later study identified puberty as occurring in 48% of African-American girls by age nine, and 12% of white girls by that age.[36]

Historical shift
The average age at which the onset of puberty occurs has dropped significantly since the 1840s.[37][38][39] Researchers[who?] refer to this drop as the 'secular trend'. In every decade from 1840 to 1950 there was a drop of four months in the average age of menarche among Western European females. In Norway, girls born in 1840 had their menarche at an average age of 17 years. In France the average in 1840 was 15.3 years. In England the average in 1840 was 16.5 years. In Japan the decline happened later and was then more rapid: from 1945 to 1975 there was a drop of 11 months per decade. The average age of menarche in various populations surveyed has ranged from 12 to 18 years.

A 2006 study in Denmark found that puberty, as evidenced by breast development, started at an average age of 9 years and 10 months, a year earlier than when a similar study was done in 1991. Scientists believe the phenomenon could be linked to obesity or exposure to chemicals in the food chain, and is putting girls at greater long-term risk of breast cancer.[45] Researchers[46] have hypothesized that early puberty onset may be caused by certain hair care products containing estrogen or placenta, and by certain chemicals, namely phthalates, which are used in many cosmetics, toys, and plastic food containers.[40]

Genetic influence and environmental factors
Various studies have found direct genetic effects to account for at least 46% of the variation of timing of puberty in well-nourished populations.[41][42][43][44] The genetic association of timing is strongest between mothers and daughters. The specific genes affecting timing are not yet known.[41] Among the candidates is an androgen receptor gene.[41]

If genetic factors account for half of the variation of pubertal timing, environmental factors are clearly important as well. The most important of the environmental influences is clearly nutrition, but a number of others have been identified, all of which affect timing of female puberty and menarche more clearly than male puberty. One of the first observed environmental effects is that puberty occurs later in children raised at higher altitudes.

Hormones and steroids
There is theoretical concern, and animal evidence, that environmental hormones and chemicals may affect aspects of prenatal or postnatal sexual development in humans.[47] Large amounts of incompletely metabolized estrogens and progestagens from pharmaceutical products are excreted into the sewage systems of large cities, and are sometimes detectable in the environment. Sex steroids are sometimes used in cattle farming but have been banned in chicken meat production for 40 years. Although agricultural laws regulate use to minimize accidental human consumption, the rules are largely self-enforced in the United States. Significant exposure of a child to hormones or other substances that activate estrogen or androgen receptors could produce some or all of the changes of puberty. More obvious degrees of partial puberty from direct exposure of young children to small but significant amounts of pharmaceutical sex steroids at home may be detected during medical evaluation for precocious puberty, but mild effects and the other potential exposures outlined above would not.[48]

Harder to detect as an influence on puberty are the more diffusely distributed environmental chemicals like PCBs (polychlorinated biphenyls), which can bind and trigger estrogen receptors.

Bisphenol A (BPA) is a chemical used to make plastics, and is frequently used to make baby bottles, water bottles, sports equipment, and medical devices, and as a coating in food and beverage cans. It leaches out of plastic into liquids and foods, and the Centers for Disease Control and Prevention (CDC) found measurable amounts of BPA in the bodies of more than 90 percent of the U.S. population studied. The highest estimated daily intakes of BPA occur in infants and children. Many plastic baby bottles contain BPA, and BPA is more likely to leach out of plastic when its temperature is increased, as when one warms a baby bottle or warms up food in the microwave. BPA mimics and interferes with the action of estrogen, an important reproduction and development regulator. Scientists are concerned about BPA's behavioral effects on fetuses, infants, and children at current exposure levels because it can affect the prostate gland and mammary gland, and lead to early puberty in girls.

Nutritional influence
Nutritional factors are the strongest and most obvious environmental factors affecting timing of puberty.[41] Much evidence suggests that for most of the last few centuries, nutritional differences accounted for the majority of variation of pubertal timing in different populations, and even among social classes in the same population. Recent worldwide increased consumption of animal protein, other changes in nutrition, and increases in childhood obesity have resulted in falling ages of puberty, mainly in those populations with the higher previous ages. In many populations the amount of variation attributable to nutrition is shrinking.

Although available dietary energy (simple calories) is the most important dietary influence on timing of puberty, quality of the diet plays a role as well. Lower protein intakes and higher dietary fiber intakes, as occur with typical vegetarian diets, are associated with later onset and slower progression of female puberty. Girls are especially sensitive to nutritional regulation because they must contribute all of the nutritional support to a growing fetus. Surplus calories (beyond growth and activity requirements) are reflected in the amount of body fat, which signals to the brain the availability of resources for initiation of puberty and fertility.

Obesity influence and exercise
Scientific researchers have linked early obesity with an earlier onset of puberty in girls. They have cited obesity as a cause of breast development before nine years and menarche before twelve years.[49] Early puberty in girls can be a harbinger of later health problems.[50] The average level of daily physical activity has also been shown to affect timing of puberty, especially in females. A high level of exercise, whether for athletic or body image purposes, or for daily subsistence, reduces energy calories available for reproduction and slows puberty. The exercise effect is often amplified by a lower body fat mass and cholesterol.

Physical and mental illness
Chronic diseases can delay puberty in both boys and girls. Those that involve chronic inflammation or interfere with nutrition have the strongest effect. In the western world, inflammatory bowel disease and tuberculosis have been notorious for such an effect in the last century, while in areas of the underdeveloped world, chronic parasite infections are widespread. Mental illnesses can also occur in puberty. The brain undergoes significant hormonally driven development during this period, which can contribute to mood disorders such as major depressive disorder, bipolar disorder, dysthymia, and schizophrenia. Girls aged between 15 and 19 make up 40% of anorexia nervosa cases.[51]

Stress and social factors
Some of the least understood environmental influences on timing of puberty are social and psychological. In comparison with the effects of genetics, nutrition, and general health, social influences are small, shifting timing by a few months rather than years. Mechanisms of these social effects are unknown, though a variety of physiological processes, including pheromones, have been suggested based on animal research. The most important part of a child's psychosocial environment is the family, and most of the social influence research has investigated features of family structure and function in relation to earlier or later female puberty. Most of the studies have reported that menarche may occur a few months earlier in girls in high-stress households, whose fathers are absent during their early childhood, who have a stepfather in the home, who are subjected to prolonged sexual abuse in childhood, or who are adopted from a developing country at a young age. Conversely, menarche may be slightly later when a girl grows up in a large family with a biological father present.
More extreme degrees of environmental stress, such as wartime refugee status with threat to physical survival, have been found to be associated with delay of maturation, an effect that may be compounded by dietary inadequacy. Most of these reported social effects are small and our understanding is incomplete. Most of these "effects" are statistical associations revealed by epidemiologic surveys. Statistical associations are not necessarily causal, and a variety of covariables and alternative explanations can be imagined. Effects of such small size can never be confirmed or refuted for any individual child. Furthermore, interpretations of the data are politically controversial because of the ease with which this type of research can be used for political advocacy. Accusations of bias based on political agenda sometimes accompany scientific criticism. Another limitation of the social research is that nearly all of it has concerned girls, partly because female puberty requires greater physiologic resources and partly because it involves a unique event (menarche) that makes survey research into female puberty much simpler than male puberty. More detail is provided in the menarche article.

Variations of sequence
The sequence of events of pubertal development can occasionally vary. For example, in about 15% of boys and girls, pubarche (the first pubic hairs) can precede, respectively, gonadarche and thelarche by a few months. Rarely, menarche can occur before other signs of puberty in a few girls. These variations deserve medical evaluation because they can occasionally signal a disease.

Conclusion
In a general sense, the conclusion of puberty is reproductive maturity. Criteria for defining the conclusion may differ for different purposes: attainment of the ability to reproduce, achievement of maximal adult height, maximal gonadal size, or adult sex hormone levels. Maximal adult height is achieved at an average age of 15 years for an average girl and 18 years for an average boy.
Potential fertility (sometimes termed nubility) usually precedes

completion of growth by 1–2 years in girls and 3–4 years in boys. Stage 5 typically represents maximal gonadal growth and adult hormone levels.

Neurohormonal process
The endocrine reproductive system consists of the hypothalamus, the pituitary, the gonads, and the adrenal glands, with input and regulation from many other body systems. True puberty is often termed "central puberty" because it begins as a process of the central nervous system. A simple description of hormonal puberty is as follows:
1. The brain's hypothalamus begins to release pulses of GnRH.
2. Cells in the anterior pituitary respond by secreting LH and FSH into the circulation.
3. The ovaries or testes respond to the rising amounts of LH and FSH by growing and beginning to produce estradiol and testosterone.
4. Rising levels of estradiol and testosterone produce the body changes of female and male puberty.
The onset of this neurohormonal process may precede the first visible body changes by 1–2 years.

Components of the endocrine reproductive system
The arcuate nucleus of the hypothalamus is the driver of the reproductive system. It has neurons which generate and release pulses of GnRH into the portal venous system of the pituitary gland. The arcuate nucleus is affected and controlled by neuronal input from other areas of the brain and hormonal input from the gonads, adipose tissue, and a variety of other systems. The pituitary gland responds to the pulsed GnRH signals by releasing LH and FSH into the blood of the general circulation, also in a pulsatile pattern. The gonads (testes and ovaries) respond to rising levels of LH and FSH by producing the steroid sex hormones, testosterone and estrogen. The adrenal glands are a second source for steroid hormones. Adrenal maturation, termed adrenarche, typically precedes gonadarche in mid-childhood.
Major hormones
- Neurokinin B (a tachykinin peptide) and kisspeptin (a neuropeptide), both present in the same hypothalamic neurons, are critical parts of the control system that switches on the release of GnRH at the start of puberty.[52]
- GnRH (gonadotropin-releasing hormone) is a peptide hormone released from the hypothalamus which stimulates gonadotrope cells of the anterior pituitary.
- LH (luteinizing hormone) is a larger protein hormone secreted into the general circulation by gonadotrope cells of the anterior pituitary gland. The main target cells of LH are the Leydig cells of the testes and the theca cells of the ovaries. LH secretion changes more dramatically with the initiation of puberty than FSH: LH levels increase about 25-fold with the onset of puberty, compared with the 2.5-fold increase of FSH.
- FSH (follicle-stimulating hormone) is another protein hormone secreted into the general circulation by the gonadotrope cells of the anterior pituitary. The main target cells of FSH are the ovarian follicles and the Sertoli cells and spermatogenic tissue of the testes.
- Testosterone is a steroid hormone produced primarily by the Leydig cells of the testes, and in lesser amounts by the theca cells of the ovaries and the adrenal cortex. Testosterone is the primary mammalian androgen and the "original" anabolic steroid. It acts on androgen receptors in responsive tissue throughout the body.
- Estradiol is a steroid hormone produced by aromatization of testosterone. Estradiol is the principal human estrogen and acts on estrogen receptors throughout the body. The largest amounts of estradiol are produced by the granulosa cells of the ovaries, but lesser amounts are derived from testicular and adrenal testosterone.
- Adrenal androgens are steroids produced by the zona reticularis of the adrenal cortex in both sexes. The major adrenal androgens are dehydroepiandrosterone, androstenedione (which are precursors of testosterone),

and dehydroepiandrosterone sulfate, which is present in large amounts in the blood. Adrenal androgens contribute to the androgenic events of early puberty in girls.
- IGF1 (insulin-like growth factor 1) rises substantially during puberty in response to rising levels of growth hormone and may be the principal mediator of the pubertal growth spurt.
- Leptin is a protein hormone produced by adipose tissue. Its primary target organ is the hypothalamus. The leptin level seems to provide the brain a rough indicator of adipose mass for purposes of regulation of appetite and energy metabolism. It also plays a permissive role in female puberty, which usually will not proceed until an adequate body mass has been achieved.

Endocrine perspective
The endocrine reproductive system becomes functional by the end of the first trimester of fetal life. The testes and ovaries become briefly inactive around the time of birth but resume hormonal activity until several months after birth, when incompletely understood mechanisms in the brain begin to suppress the activity of the arcuate nucleus. This has been referred to as maturation of the prepubertal "gonadostat," which becomes sensitive to negative feedback by sex steroids. The period of hormonal activity until several months after birth, followed by suppression of activity, may correspond to the period of infant sexuality, followed by a latency stage, which Sigmund Freud described.[53]

Gonadotropin and sex steroid levels fall to low levels (nearly undetectable by current clinical assays) for approximately another 8 to 10 years of childhood. Evidence is accumulating that the reproductive system is not totally inactive during the childhood years. Subtle increases in gonadotropin pulses occur, and ovarian follicles surrounding germ cells (future eggs) double in number. Normal puberty is initiated in the hypothalamus, with de-inhibition of the pulse generator in the arcuate nucleus.
This inhibition of the arcuate nucleus is an ongoing active suppression by other areas of the brain. The signal and mechanism releasing the arcuate nucleus from inhibition have been the subject of investigation for decades and remain incompletely understood. Leptin levels rise throughout childhood and play a part in allowing the arcuate nucleus to resume operation. If the childhood inhibition of the arcuate nucleus is interrupted prematurely by injury to the brain, it may resume pulsatile gonadotropin release and puberty will begin at an early age. Neurons of the arcuate nucleus secrete gonadotropin-releasing hormone (GnRH) into the blood of the pituitary portal system. An American physiologist, Ernst Knobil, found that the GnRH signals from the hypothalamus induce pulsed secretion of LH (and to a lesser degree, FSH) at roughly 1–2 hour intervals. The LH pulses are the consequence of pulsatile GnRH secretion by the arcuate nucleus that, in turn, is the result of an oscillator or signal generator in the central nervous system ("GnRH pulse generator").[54] Robert M. Boyar discovered that, in the years preceding physical puberty, the gonadotropin pulses occur only during sleep, but as puberty progresses they can be detected during the day.[55] By the end of puberty, there is little day-night difference in the amplitude and frequency of gonadotropin pulses. Some investigators have attributed the onset of puberty to a resonance of oscillators in the brain.[56][57][58][59] By this mechanism, the gonadotropin pulses that occur primarily at night just before puberty represent beats.[60][61][62] An array of "autoamplification processes" increases the production of all of the pubertal hormones of the hypothalamus, pituitary, and gonads.[citation needed]
Regulation of adrenarche and its relationship to maturation of the hypothalamic-gonadal axis is not fully understood, and some evidence suggests it is a parallel but largely independent process coincident with or even preceding central puberty. Rising levels of adrenal androgens (termed adrenarche) can usually be detected between 6 and 11 years of age, even before the increasing gonadotropin pulses of hypothalamic puberty. Adrenal androgens contribute to the development of pubic hair (pubarche), adult body odor, and

other androgenic changes in both sexes. The primary clinical significance of the distinction between adrenarche and gonadarche is that pubic hair and body odor changes by themselves do not prove that central puberty is underway for an individual child.

Hormonal changes in girls
As the amplitude of LH pulses increases, the theca cells of the ovaries begin to produce testosterone and smaller amounts of progesterone. Much of the testosterone moves into nearby cells called granulosa cells. Smaller increases of FSH induce an increase in the aromatase activity of these granulosa cells, which converts most of the testosterone to estradiol for secretion into the circulation. Rising levels of estradiol produce the characteristic estrogenic body changes of female puberty: growth spurt, acceleration of bone maturation and closure, breast growth, increased fat composition, growth of the uterus, increased thickness of the endometrium and the vaginal mucosa, and widening of the lower pelvis. Levels of adrenal androgens and testosterone also increase during puberty, producing the typical androgenic changes of female puberty: pubic hair, other androgenic hair as outlined above, body odor, and acne.

As the estradiol levels gradually rise and the other autoamplification processes occur, a point of maturation is reached when the feedback sensitivity of the hypothalamic "gonadostat" becomes positive. This attainment of positive feedback is the hallmark of female sexual maturity, as it allows the mid-cycle LH surge necessary for ovulation. Growth hormone levels rise steadily throughout puberty. IGF1 levels rise and then decline as puberty ends. Growth finishes and adult height is attained as the estradiol levels complete closure of the epiphyses.

Hormonal changes in boys
Early stages of male hypothalamic maturation seem to be very similar to the early stages of female puberty, though occurring about 1–2 years later. LH stimulates the Leydig cells of the testes to make testosterone, and blood levels begin to rise; for much of puberty, nighttime levels of testosterone are higher than daytime. Regularity of frequency and amplitude of gonadotropin pulses seems to be less necessary for progression of male than female puberty. Levels of adrenal androgens also increase, and most of the effects are mediated through the androgen receptors by way of conversion to dihydrotestosterone in target organs. Another hormonal change takes place during the teenage years for most young men: at this point in a male's life the testosterone levels slowly rise. However, a significant portion of testosterone in adolescent boys is converted to estradiol. Estradiol mediates the growth spurt, bone maturation, and epiphyseal closure in boys just as in girls. Estradiol also induces at least modest development of breast tissue (gynecomastia) in a large proportion of boys; boys who develop mild gynecomastia, or even swellings under the nipples, during puberty are told the effects are temporary, as they are due to high levels of estradiol in some male teenagers.

Stages
- adrenarche (approximately age 7)
- gonadarche (approximately age 8)
- thelarche (approximately age 11 in females)
- pubarche (approximately age 12)
- menarche (approximately age 12.5 in females)
- spermarche (in males)

Speaker recognition

Speaker recognition is the computing task of validating a user's claimed identity using characteristics extracted from their voices.

There is a difference between speaker recognition (recognizing who is speaking) and speech recognition (recognizing what is being said). These two terms are frequently confused, as is voice recognition. Voice recognition is a combination of the two: it uses learned aspects of a speaker's voice to determine what is being said. Such a system cannot recognise speech from random speakers very accurately, but it can reach high accuracy for individual voices it has been trained with. In addition, there is a difference between the act of authentication (commonly referred to as speaker verification or speaker authentication) and identification.

Speaker recognition has a history dating back some four decades and uses the acoustic features of speech that have been found to differ between individuals. These acoustic patterns reflect both anatomy (e.g. size and shape of the throat and mouth) and learned behavioral patterns (e.g. voice pitch, speaking style). Speaker verification has earned speaker recognition its classification as a "behavioral biometric."

Verification versus identification
There are two major applications of speaker recognition technologies and methodologies. If the speaker claims to be of a certain identity and the voice is used to verify this claim, this is called verification or authentication. On the other hand, identification is the task of determining an unknown speaker's identity. In a sense, speaker verification is a 1:1 match where one speaker's voice is matched to one template (also called a "voice print" or "voice model"), whereas speaker identification is a 1:N match where the voice is compared against N templates.

From a security perspective, identification is different from verification. For example, presenting your passport at border control is a verification process: the agent compares your face to the picture in the document. Conversely, a police officer comparing a sketch of an assailant against a database of previously documented criminals to find the closest match(es) is an identification process.

Speaker verification is usually employed as a "gatekeeper" in order to provide access to a secure system (e.g. telephone banking). These systems operate with the user's knowledge and typically require their cooperation. Speaker identification systems can also be implemented covertly, without the user's knowledge, to identify talkers in a discussion, alert automated systems of speaker changes, check if a user is already enrolled in a system, etc. In forensic applications, it is common to first perform a speaker identification process to create a list of "best matches" and then perform a series of verification processes to determine a conclusive match.[citation needed]

Variants of speaker recognition
Each speaker recognition system has two phases: enrollment and verification. During enrollment, the speaker's voice is recorded and typically a number of features are extracted to form a voice print, template, or model. In the verification phase, a speech sample or "utterance" is compared against a previously created voice print. For identification systems, the utterance is compared against multiple voice prints in order to determine the best match(es), while verification systems compare an utterance against a single voice print. Because of the process involved, verification is faster than identification.

Speaker recognition systems fall into two categories: text-dependent and text-independent. If the text must be the same for enrollment and verification, this is called text-dependent recognition. In a text-dependent system, prompts can either be common across all speakers (e.g. a common pass phrase) or unique. In addition, the use of shared secrets (e.g. passwords and PINs) or knowledge-based information can be employed in order to create a multi-factor authentication scenario.

Text-independent systems are most often used for speaker identification, as they require very little if any cooperation by the speaker. In this case the text during enrollment and test is different. In fact, the enrollment may happen without the user's knowledge, as in the case for many forensic applications. As text-independent technologies do not compare what was said at enrollment and verification, verification applications tend to also employ speech recognition to determine what the user is saying at the point of authentication.
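The enrollment and verification phases described above can be sketched in code. The following toy Python example is an illustration only: the "feature vectors," the averaging used to build a voice print, the cosine-similarity score, and the threshold are all invented for this sketch rather than taken from any real system. It shows enrollment producing a voice print, a 1:1 verification check against a claimed identity, and a 1:N identification search across all enrolled templates.

```python
import math

# Toy speaker-recognition sketch (illustration only, not a real system).
# A "voice print" is modeled as the average of per-utterance feature
# vectors; real systems extract features such as MFCCs instead.

def enroll(utterance_features):
    """Average several utterance feature vectors into one voice print."""
    n = len(utterance_features)
    dim = len(utterance_features[0])
    return [sum(f[i] for f in utterance_features) / n for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def verify(utterance, voice_print, threshold=0.9):
    """1:1 match: accept or reject a claimed identity."""
    return cosine(utterance, voice_print) >= threshold

def identify(utterance, database):
    """1:N match: return the enrolled speaker with the best score."""
    return max(database, key=lambda name: cosine(utterance, database[name]))

# Hypothetical enrollment data for two speakers.
db = {
    "alice": enroll([[1.0, 0.2, 0.1], [0.9, 0.3, 0.1]]),
    "bob":   enroll([[0.1, 0.9, 0.8], [0.2, 1.0, 0.7]]),
}

sample = [0.95, 0.25, 0.1]          # an utterance resembling alice's prints
print(verify(sample, db["alice"]))  # 1:1 verification against one template
print(identify(sample, db))         # 1:N identification across all templates
```

The threshold controls the trade-off real systems face between false accepts (an impostor scoring above it) and false rejects (the true speaker scoring below it); identification needs no threshold because it simply returns the best of the N matches.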

Technology
The various technologies used to process and store voice prints include frequency estimation, hidden Markov models, Gaussian mixture models, pattern matching algorithms, neural networks, matrix representation, vector quantization, and decision trees. Some systems also use "anti-speaker" techniques, such as cohort models and world models.

Ambient noise levels can impede both collection of the initial and subsequent voice samples. Noise reduction algorithms can be employed to improve accuracy, but incorrect application can have the opposite effect. Performance degradation can result from changes in behavioural attributes of the voice and from enrolment using one telephone and verification on another telephone ("cross channel"). Voice changes due to ageing may impact system performance over time. Some systems adapt the speaker models after each successful verification to capture such long-term changes in the voice, though there is debate regarding the overall security impact imposed by automated adaptation.

Capture of the biometric is seen as non-invasive. The technology traditionally uses existing microphones and voice transmission technology, allowing recognition over long distances via ordinary telephones (wired or wireless). Integration with two-factor authentication products is expected to increase. Both digitally recorded and analogue recorded voice identification use electronic measurements as well as critical listening skills that must be applied by a forensic expert in order for the identification to be accurate.
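As one concrete instance of the Gaussian mixture models mentioned above, the sketch below scores utterance feature frames under a claimed speaker's mixture versus a background ("world") model, and accepts when the average log-likelihood ratio is positive. This is a simplified, hand-parameterized illustration in the spirit of a GMM likelihood-ratio test; the component weights, means, and variances are invented, and real systems train these models from large amounts of speech.

```python
import math

# Simplified GMM likelihood-ratio scoring for speaker verification.
# Each mixture is a list of (weight, mean_vector, variance_vector) with
# diagonal covariance; all parameters here are made up for illustration.

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frame, gmm):
    """Log-likelihood of one feature frame under a mixture."""
    logs = [math.log(w) + log_gauss(frame, m, v) for w, m, v in gmm]
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def llr_score(frames, speaker_gmm, background_gmm):
    """Average log-likelihood ratio: speaker model vs. background model."""
    return sum(
        gmm_loglik(f, speaker_gmm) - gmm_loglik(f, background_gmm)
        for f in frames
    ) / len(frames)

# Hypothetical 2-D models: (weight, mean, variance) per component.
speaker = [(0.5, [0.0, 0.0], [1.0, 1.0]),
           (0.5, [2.0, 2.0], [1.0, 1.0])]
background = [(1.0, [5.0, 5.0], [4.0, 4.0])]

frames = [[0.1, -0.2], [1.9, 2.1], [0.3, 0.4]]  # frames near the speaker model
print(llr_score(frames, speaker, background))   # positive score -> accept claim
```

Scoring against a background model rather than using the raw speaker likelihood is what makes the decision robust across channels and recording conditions: both likelihoods shift together, so their ratio stays informative.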

Speech synthesis
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units: a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output.

The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1980s.

Overview of text processing
A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words; this process is often called text normalization, pre-processing, or tokenization. Second, the front-end assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end, often referred to as the synthesizer, then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech.

History
Long before electronic signal processing was invented, there were those who tried to build machines to create human speech. Some early legends of the existence of "speaking heads" involved Gerbert of Aurillac (d. 1003 AD), Albertus Magnus (1198–1280), and Roger Bacon (1214–1294).

In 1779, the Danish scientist Christian Kratzenstein, working at the Russian Academy of Sciences, built models of the human vocal tract that could produce the five long vowel sounds (in International Phonetic Alphabet notation, [aː], [eː], [iː], [oː] and [uː]). This was followed by the bellows-operated "acoustic-mechanical speech machine" by Wolfgang von Kempelen of Vienna, Austria, described in a 1791 paper. This machine added models of the tongue and lips, enabling it to produce consonants as well as vowels. In 1837, Charles Wheatstone produced a "speaking machine" based on von Kempelen's design, and in 1857, M. Faber built the "Euphonia". Wheatstone's design was resurrected in 1923 by Paget.

In the 1930s, Bell Labs developed the VOCODER. Homer Dudley refined this device into the VODER, a keyboard-operated electronic speech analyzer and synthesizer that was said to be clearly intelligible, which he exhibited at the 1939 New York World's Fair.
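The front-end's first task, text normalization, can be sketched in a few lines. This is a deliberately tiny model: the abbreviation table and the digit-by-digit number reading are invented toy data, and a real front-end handles far more contexts (dates, currency, Roman numerals, and so on), as discussed later in this article.

```python
# Hypothetical mini-lexicon; a real front-end uses far larger tables.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Saint", "etc.": "et cetera"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

def expand_number(tok):
    """Spell a digit string out digit-by-digit (one of several valid readings)."""
    return " ".join(DIGITS[int(d)] for d in tok)

def normalize(text):
    """Front-end step 1: turn raw text into written-out words."""
    out = []
    for tok in text.split():
        if tok in ABBREVIATIONS:
            out.append(ABBREVIATIONS[tok])
        elif tok.isdigit():
            out.append(expand_number(tok))
        else:
            out.append(tok)
    return " ".join(out)
```

For example, `normalize("Dr. Smith lives at 42")` yields "Doctor Smith lives at four two"; choosing between the possible number readings is exactly the ambiguity problem treated under "Challenges" below.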

The Pattern Playback was built by Dr. Franklin S. Cooper and his colleagues at Haskins Laboratories in the late 1940s and completed in 1950. There were several different versions of this hardware device, but only one currently survives. The machine converts pictures of the acoustic patterns of speech, in the form of a spectrogram, back into sound. Using this device, Alvin Liberman and colleagues were able to discover acoustic cues for the perception of phonetic segments (consonants and vowels). Despite the success of purely electronic speech synthesis, research is still being conducted into mechanical speech synthesizers.

Electronic devices
The first computer-based speech synthesis systems were created in the late 1950s, and the first complete text-to-speech system was completed in 1968. In 1961, physicist John Larry Kelly, Jr and colleague Louis Gerstman used an IBM 704 computer to synthesize speech, an event among the most prominent in the history of Bell Labs. Kelly's voice recorder synthesizer (vocoder) recreated the song "Daisy Bell", with musical accompaniment from Max Mathews. Coincidentally, Arthur C. Clarke was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility. Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel 2001: A Space Odyssey, where the HAL 9000 computer sings the same song as it is being put to sleep by astronaut Dave Bowman. Dominant systems in the 1980s and 1990s were the MITalk system, based largely on the work of Dennis Klatt at MIT, and the Bell Labs system; the latter was one of the first multilingual language-independent systems, making extensive use of natural language processing methods.

Early electronic speech synthesizers sounded robotic and were often barely intelligible. The quality of synthesized speech has steadily improved, but output from contemporary speech synthesis systems is still clearly distinguishable from actual human speech. As the cost-performance ratio causes speech synthesizers to become cheaper and more accessible, more people will benefit from the use of text-to-speech programs.

Synthesizer technologies
The most important qualities of a speech synthesis system are naturalness and intelligibility. Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood. The ideal speech synthesizer is both natural and intelligible, and speech synthesis systems usually try to maximize both characteristics. The two primary technologies for generating synthetic speech waveforms are concatenative synthesis and formant synthesis. Each technology has strengths and weaknesses, and the intended uses of a synthesis system will typically determine which approach is used.

Concatenative synthesis
Concatenative synthesis is based on the concatenation (or stringing together) of segments of recorded speech. Generally, concatenative synthesis produces the most natural-sounding synthesized speech. However, differences between natural variations in speech and the nature of the automated techniques for segmenting the waveforms sometimes result in audible glitches in the output. There are three main sub-types of concatenative synthesis.

Unit selection synthesis
Unit selection synthesis uses large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual phones, diphones, half-phones, syllables, morphemes, words, phrases, and sentences. Typically, the division into segments is done using a specially modified speech recognizer set to a "forced alignment" mode, with some manual correction afterward, using visual representations such as the waveform and spectrogram. An index of the units in the speech database is then created based on the segmentation and acoustic parameters like the fundamental frequency (pitch), duration, position in the syllable, and neighboring phones. At runtime, the desired target utterance is created by determining the best chain of candidate units from the database (unit selection). This process is typically achieved using a specially weighted decision tree.
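The "best chain of candidate units" step can be sketched as a dynamic-programming search that trades off a target cost (how well a unit fits the requested pitch and duration) against a concatenation cost (how smoothly adjacent units join). This is a toy model under stated assumptions: units here are invented (pitch, duration) pairs, the cost functions are illustrative, and production systems use many more acoustic features and the weighted decision trees noted above.

```python
def target_cost(candidate, target):
    """How well a database unit matches the desired (pitch, duration) spec."""
    return abs(candidate[0] - target[0]) + abs(candidate[1] - target[1])

def concat_cost(prev, cur):
    """Penalty for joining two units with mismatched boundary pitch."""
    return abs(prev[0] - cur[0])

def select_units(candidates_per_slot, targets):
    """Viterbi-style search for the cheapest chain of candidate units."""
    # best maps candidate -> (cumulative cost, chain of chosen units so far)
    best = {c: (target_cost(c, targets[0]), [c]) for c in candidates_per_slot[0]}
    for slot in range(1, len(targets)):
        nxt = {}
        for c in candidates_per_slot[slot]:
            tc = target_cost(c, targets[slot])
            cost, path = min(
                (best[p][0] + concat_cost(p, c) + tc, best[p][1] + [c])
                for p in best
            )
            nxt[c] = (cost, path)
        best = nxt
    return min(best.values())[1]
```

The search prefers a slightly worse-fitting unit if it joins more smoothly with its neighbours, which is exactly the trade-off that occasionally makes unit selection pick a suboptimal segment, as noted below.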

Unit selection provides the greatest naturalness, because it applies only a small amount of digital signal processing (DSP) to the recorded speech; DSP often makes recorded speech sound less natural, although some systems use a small amount of signal processing at the point of concatenation to smooth the waveform. The output from the best unit-selection systems is often indistinguishable from real human voices, especially in contexts for which the TTS system has been tuned. However, maximum naturalness typically requires unit-selection speech databases to be very large, in some systems ranging into the gigabytes of recorded data and representing dozens of hours of speech. Also, unit selection algorithms have been known to select segments from a place that results in less than ideal synthesis (e.g. minor words become unclear) even when a better choice exists in the database.

Diphone synthesis
Diphone synthesis uses a minimal speech database containing all the diphones (sound-to-sound transitions) occurring in a language. The number of diphones depends on the phonotactics of the language: for example, Spanish has about 800 diphones, and German about 2500. In diphone synthesis, only one example of each diphone is contained in the speech database. At runtime, the target prosody of a sentence is superimposed on these minimal units by means of digital signal processing techniques such as linear predictive coding, PSOLA or MBROLA.[19] The quality of the resulting speech is generally worse than that of unit-selection systems, but more natural-sounding than the output of formant synthesizers. Diphone synthesis suffers from the sonic glitches of concatenative synthesis and the robotic-sounding nature of formant synthesis, and has few of the advantages of either approach other than small size. As such, its use in commercial applications is declining, although it continues to be used in research because there are a number of freely available software implementations.

Domain-specific synthesis
Domain-specific synthesis concatenates prerecorded words and phrases to create complete utterances. It is used in applications where the variety of texts the system will output is limited to a particular domain, like transit schedule announcements or weather reports. The technology is very simple to implement, and has been in commercial use for a long time, in devices like talking clocks and calculators. The level of naturalness of these systems can be very high because the variety of sentence types is limited, and they closely match the prosody and intonation of the original recordings. Because these systems are limited by the words and phrases in their databases, they are not general-purpose and can only synthesize the combinations of words and phrases with which they have been preprogrammed.[citation needed]

The blending of words within naturally spoken language can still cause problems unless the many variations are taken into account. For example, in non-rhotic dialects of English the "r" in words like "clear" /ˈklɪə/ is usually only pronounced when the following word has a vowel as its first letter (e.g. "clear out" is realized as /ˌklɪəˈɹaʊt/). Likewise in French, many final consonants become no longer silent if followed by a word that begins with a vowel, an effect called liaison. This alternation cannot be reproduced by a simple word-concatenation system, which would require additional complexity to be context-sensitive.

Formant synthesis
Formant synthesis does not use human speech samples at runtime. Instead, the synthesized speech output is created using additive synthesis and an acoustic model (physical modelling synthesis).[20] Parameters such as fundamental frequency, voicing, and noise levels are varied over time to create a waveform of artificial speech. This method is sometimes called rules-based synthesis; however, many concatenative systems also have rules-based components. Many systems based on formant synthesis technology generate artificial, robotic-sounding speech that would never be mistaken for human speech. However, maximum naturalness is not always the goal of a speech synthesis system, and formant synthesis systems have advantages over concatenative systems.
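The additive-synthesis idea behind formant synthesis can be sketched crudely: generate one sine wave per formant band and sum them. This toy omits everything that makes formant synthesis sound like speech (a glottal source, time-varying filters, noise for unvoiced sounds); the sample rate and the formant frequencies for [a] are assumed textbook-style values, not taken from this article.

```python
import math

SAMPLE_RATE = 8000  # Hz; toy value

def synth_vowel(formants, amplitudes, duration=0.01):
    """Additive synthesis: sum one sine per formant band (a crude static 'vowel')."""
    n = int(SAMPLE_RATE * duration)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        s = sum(a * math.sin(2 * math.pi * f * t)
                for f, a in zip(formants, amplitudes))
        samples.append(s)
    return samples

# Rough formant targets for an [a]-like vowel (assumed illustrative values)
ah = synth_vowel(formants=[700, 1200, 2600], amplitudes=[1.0, 0.5, 0.25])
```

Varying the formant frequencies, amplitudes, and a noise component over time, frame by frame, is what the rule systems described above actually control.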

Formant-synthesized speech can be reliably intelligible, even at very high speeds, avoiding the acoustic glitches that commonly plague concatenative systems. High-speed synthesized speech is used by the visually impaired to quickly navigate computers using a screen reader. Formant synthesizers are usually smaller programs than concatenative systems because they do not have a database of speech samples; they can therefore be used in embedded systems, where memory and microprocessor power are especially limited. Because formant-based systems have complete control of all aspects of the output speech, a wide variety of prosodies and intonations can be output, conveying not just questions and statements, but a variety of emotions and tones of voice. Examples of non-real-time but highly accurate intonation control in formant synthesis include the work done in the late 1970s for the Texas Instruments toy Speak & Spell,[21] in many Atari arcade games[22] using the TMS5220 LPC chips, and in the early 1980s Sega arcade machines. Creating proper intonation for these projects was painstaking, and the results have yet to be matched by real-time text-to-speech interfaces.[23]

HMM-based synthesis
HMM-based synthesis, also called Statistical Parametric Synthesis, is a synthesis method based on hidden Markov models. In this system, the frequency spectrum (vocal tract), fundamental frequency (vocal source), and duration (prosody) of speech are modeled simultaneously by HMMs. Speech waveforms are generated from the HMMs themselves based on the maximum likelihood criterion.

Articulatory synthesis
Articulatory synthesis refers to computational techniques for synthesizing speech based on models of the human vocal tract and the articulation processes occurring there. The first articulatory synthesizer regularly used for laboratory experiments was developed at Haskins Laboratories in the mid-1970s by Philip Rubin, Tom Baer, and Paul Mermelstein. This synthesizer, known as ASY, was based on vocal tract models developed at Bell Laboratories in the 1960s and 1970s by Paul Mermelstein, Cecil Coker, and colleagues.

Until recently, articulatory synthesis models have not been incorporated into commercial speech synthesis systems. A notable exception is the NeXT-based system originally developed and marketed by Trillium Sound Research, a spin-off company of the University of Calgary, where much of the original research was conducted. Following the demise of the various incarnations of NeXT (started by Steve Jobs in the late 1980s and merged with Apple Computer in 1997), the Trillium software was published under the GNU General Public License, with work continuing as gnuspeech. The system, first marketed in 1994, provides full articulatory-based text-to-speech conversion using a waveguide or transmission-line analog of the human oral and nasal tracts controlled by Carré's "distinctive region model".[24]

Sinewave synthesis
Sinewave synthesis is a technique for synthesizing speech by replacing the formants (main bands of energy) with pure tone whistles.[25]

Challenges

Text normalization challenges
The process of normalizing text is rarely straightforward. Texts are full of heteronyms, numbers, and abbreviations that all require expansion into a phonetic representation. There are many spellings in English which are pronounced differently based on context. For example, "My latest project is to learn how to better project my voice" contains two pronunciations of "project". Most text-to-speech (TTS) systems do not generate semantic representations of their input texts, as processes for doing so are not reliable, well understood, or computationally effective. As a result, various heuristic techniques are used to guess the proper way to disambiguate homographs, like examining neighboring words and using statistics about frequency of occurrence. Recently, TTS systems have begun to use HMMs (discussed above) to generate "parts of speech" to aid in disambiguating homographs. This technique is quite successful for many cases, such as whether "read" should be pronounced as "red", implying past tense, or as "reed", implying present tense.
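A neighboring-word heuristic of the kind described above can be sketched as follows. The cue-word list and two-word window are invented toy choices, standing in for the statistical part-of-speech models real systems use.

```python
# Toy heuristic: pick a pronunciation for the homograph "read" by
# inspecting the two preceding words. The cue list is illustrative only.
PAST_CUES = {"have", "has", "had", "was", "were", "already"}

def pronounce_read(tokens, i):
    """Return 'red' (past tense) or 'reed' (present tense) for tokens[i] == 'read'."""
    window = {w.lower() for w in tokens[max(0, i - 2):i]}
    return "red" if window & PAST_CUES else "reed"
```

So "I have read the book" triggers the "red" reading via the auxiliary "have", while "Please read this aloud" falls through to "reed"; an HMM part-of-speech tagger generalizes this idea beyond hand-picked cue words.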

Typical error rates when using HMMs in this fashion are usually below five percent.

Deciding how to convert numbers is another problem that TTS systems have to address. It is a simple programming challenge to convert a number into words (at least in English), like "1325" becoming "one thousand three hundred twenty-five". However, numbers occur in many different contexts: "1325" may also be read as "one three two five", "thirteen twenty-five" or "thirteen hundred and twenty five". A TTS system can often infer how to expand a number based on surrounding words, numbers, and punctuation, and sometimes the system provides a way to specify the context if it is ambiguous.[26] Roman numerals can also be read differently depending on context: "Henry VIII" reads as "Henry the Eighth", while "Chapter VIII" reads as "Chapter Eight".

Similarly, abbreviations can be ambiguous. For example, the abbreviation "in" for "inches" must be differentiated from the word "in", and the address "12 St John St." uses the same abbreviation for both "Saint" and "Street". TTS systems with intelligent front ends can make educated guesses about ambiguous abbreviations, while others provide the same result in all cases, resulting in nonsensical (and sometimes comical) outputs.

Text-to-phoneme challenges
Speech synthesis systems use two basic approaches to determine the pronunciation of a word based on its spelling, a process which is often called text-to-phoneme or grapheme-to-phoneme conversion (phoneme is the term used by linguists to describe distinctive sounds in a language). The simplest approach to text-to-phoneme conversion is the dictionary-based approach, where a large dictionary containing all the words of a language and their correct pronunciations is stored by the program. Determining the correct pronunciation of each word is then a matter of looking up each word in the dictionary and replacing the spelling with the pronunciation specified in the dictionary. The other approach is rule-based, in which pronunciation rules are applied to words to determine their pronunciations based on their spellings. This is similar to the "sounding out", or synthetic phonics, approach to learning reading.

Each approach has advantages and drawbacks. The dictionary-based approach is quick and accurate, but completely fails if it is given a word which is not in its dictionary.[citation needed] As dictionary size grows, so too do the memory space requirements of the synthesis system. On the other hand, the rule-based approach works on any input, but the complexity of the rules grows substantially as the system takes into account irregular spellings or pronunciations. (Consider that the word "of" is very common in English, yet is the only word in which the letter "f" is pronounced [v].) As a result, nearly all speech synthesis systems use a combination of these approaches.

Languages with a phonemic orthography have a very regular writing system, and the prediction of the pronunciation of words based on their spellings is quite successful. Speech synthesis systems for such languages often use the rule-based method extensively, resorting to dictionaries only for those few words, like foreign names and borrowings, whose pronunciations are not obvious from their spellings. On the other hand, speech synthesis systems for languages like English, which have extremely irregular spelling systems, are more likely to rely on dictionaries, and to use rule-based methods only for unusual words or words that aren't in their dictionaries. These techniques also work well for most European languages, although access to required training corpora is frequently difficult in these languages.

Evaluation challenges
The consistent evaluation of speech synthesis systems may be difficult because of a lack of universally agreed objective evaluation criteria. Different organizations often use different speech data. The quality of speech synthesis systems also depends to a large degree on the quality of the production technique (which may involve analogue or digital recording) and on the facilities used to replay the speech. Evaluating speech synthesis systems has therefore often been compromised by differences between production techniques and replay facilities.
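The "combination of these approaches" can be sketched as a dictionary lookup with a rule-based fallback. The lexicon entries and the one-letter-per-phoneme fallback rules below are invented toy data (loosely ARPAbet-flavored), far simpler than a real pronouncing dictionary or real letter-to-sound rules.

```python
# Minimal hybrid text-to-phoneme conversion: dictionary first, letter
# rules as a fallback. Lexicon and rules are illustrative toy data.
LEXICON = {"of": ["AH", "V"], "cat": ["K", "AE", "T"]}
LETTER_RULES = {"a": "AE", "b": "B", "c": "K", "d": "D", "o": "OW",
                "f": "F", "g": "G", "t": "T", "s": "S"}

def to_phonemes(word):
    word = word.lower()
    if word in LEXICON:
        # Fast and accurate, but fails silently outside its coverage,
        # which is why a fallback is needed.
        return LEXICON[word]
    # Naive rule fallback: one phoneme per letter (real rules use context).
    return [LETTER_RULES[ch] for ch in word if ch in LETTER_RULES]
```

Note how the irregular "of" gets its [v] only because it is in the dictionary; the rules alone would wrongly produce an [f], which is the trade-off described above.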

Recently, however, some researchers have started to evaluate speech synthesis systems using a common speech dataset.[27]

Prosodics and emotional content
A study reported in the journal Speech Communication by Amy Drahota and colleagues at the University of Portsmouth, UK, reported that listeners to voice recordings could determine, at better than chance levels, whether or not the speaker was smiling.[28] It was suggested that identification of the vocal features which signal emotional content may be used to help make synthesized speech sound more natural.

Dedicated hardware
• Votrax SC-01A (analog formant)
• Votrax SC-02 / SSI-263 / "Arctic 263"
• General Instruments SP0256-AL2 (CTS256A-AL2, MEA8000)
• Magnevation SpeakJet (www.speechchips.com TTS256)
• Savage Innovations SoundGin
• National Semiconductor DT1050 Digitalker (Mozer)
• Silicon Systems SSI 263 (analog formant)
• Texas Instruments LPC Speech Chips: TMS5110A, TMS5200
• Oki Semiconductor ML22825 (ADPCM), ML22573 (HQADPCM)
• Toshiba T6721A
• Philips PCF8200
• TextSpeak Embedded TTS Modules

Computer operating systems or outlets with speech synthesis

Atari
Arguably, the first speech system integrated into an operating system was that of the 1400XL/1450XL personal computers designed by Atari, Inc., using the Votrax SC01 chip in 1983. The 1400XL/1450XL computers used a finite state machine to enable World English Spelling text-to-speech synthesis.[29] Unfortunately, the 1400XL/1450XL personal computers never shipped in quantity. The Atari ST computers were sold with "stspeech.tos" on floppy disk.

Apple
The first speech system integrated into an operating system that shipped in quantity was Apple Computer's MacInTalk in 1984; since the 1980s, Macintosh computers have offered text-to-speech capabilities through this software. Starting as a curiosity, the speech system of Apple Macintosh has evolved into a fully supported program, PlainTalk, for people with vision problems. In the early 1990s Apple expanded its capabilities, offering system-wide text-to-speech support, and also introduced speech recognition into its systems, which provided a fluid command set. With the introduction of faster PowerPC-based computers, Apple included higher quality voice sampling. More recently, Apple has added sample-based voices. Mac OS X also includes say, a command-line based application that converts text to audible speech. The AppleScript Standard Additions include a say verb that allows a script to use any of the installed voices and to control the pitch, speaking rate and modulation of the spoken text.

During 10.4 (Tiger) and the first releases of 10.5 (Leopard) there was only one standard voice shipping with Mac OS X. VoiceOver was featured for the first time in Mac OS X Tiger (10.4). Starting with 10.6 (Snow Leopard), the user can choose from a wide range of voices. VoiceOver voices feature the taking of realistic-sounding breaths between sentences, as well as improved clarity at high read rates over PlainTalk.
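The say command-line tool mentioned above can be scripted from any language. Here is a minimal Python wrapper (the -v voice and -r rate flags are standard say options; the helper names and the "Alex" voice are our own choices, and actually speaking of course only works on macOS).

```python
import subprocess

def speak(text, voice=None, rate=None):
    """Build the argv for a macOS `say` invocation (returned for inspection)."""
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]       # e.g. "Alex"
    if rate:
        cmd += ["-r", str(rate)]   # speaking rate in words per minute
    cmd.append(text)
    return cmd

def speak_now(text, **kw):
    """Actually run `say`; requires macOS with the tool on PATH."""
    subprocess.run(speak(text, **kw), check=True)
```

For instance, `speak("Hello", voice="Alex", rate=180)` builds `["say", "-v", "Alex", "-r", "180", "Hello"]`, which `speak_now` would then execute.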

AmigaOS
The second operating system with advanced speech synthesis capabilities was AmigaOS, introduced in 1985. The voice synthesis was licensed by Commodore International from a third-party software house (Don't Ask Software, now Softvoice, Inc.) and featured a complete system of voice emulation, with both male and female voices and "stress" indicator markers, made possible by advanced features of the Amiga hardware audio chipset.[30] It was divided into a narrator device and a translator library; Amiga Speak Handler featured a text-to-speech translator. AmigaOS considered speech synthesis a virtual hardware device, so the user could even redirect console output to it. Some Amiga programs, such as word processors, made extensive use of the speech system.

Microsoft Windows
Modern Windows systems use SAPI4- and SAPI5-based speech systems that include a speech recognition engine (SRE). SAPI 4.0 was available on Microsoft-based operating systems as a third-party add-on for systems like Windows 95 and Windows 98. Windows 2000 added a speech synthesis program called Narrator, directly available to users. All Windows-compatible programs could make use of speech synthesis features, available through menus once installed on the system. Microsoft Speech Server is a complete package for voice synthesis and recognition, for commercial applications such as call centers.

Text-to-Speech (TTS) capability refers to the ability of the operating system to play back printed text as spoken words. An internal driver (installed with the operating system), called a TTS engine, recognizes the text and, using a synthesized voice chosen from several pre-generated voices, speaks the written text. Additional engines (often using a certain jargon or vocabulary) are also available through third-party manufacturers.[31]

Android
Version 1.6 of Android added support for speech synthesis (TTS).[32]

Internet
Currently, there are a number of applications, plugins and gadgets that can read messages directly from an e-mail client and web pages from a web browser or Google Toolbar, such as Text-to-voice, which is an add-on to Firefox. Some specialized software can narrate RSS feeds. On one hand, online RSS narrators simplify information delivery by allowing users to listen to their favourite news sources and to convert them to podcasts; users can download generated audio files to portable devices, e.g. with the help of a podcast receiver, and listen to them while walking, jogging or commuting to work. On the other hand, online RSS readers are available on almost any PC connected to the Internet.

A growing field in internet-based TTS is web-based assistive technology, e.g. "Browsealoud" from a UK company, and Readspeaker. It can deliver TTS functionality to anyone (for reasons of accessibility, convenience, entertainment or information) with access to a web browser. Additionally, SPEAK.TO.ME from Oxford Information Laboratories is capable of delivering text to speech through any browser without the need to download any special applications, and includes smart delivery technology to ensure only what is seen is spoken and that the content is logically pathed.

The most recent TTS development in the web browser is the JavaScript Text to Speech work of Yury Delendik, which ports the Flite C engine to pure JavaScript. This allows web pages to convert text to audio using HTML5 technology. The ability to use this TTS port currently requires a custom browser build that uses Mozilla's Audio Data API. However, much work is being done in the context of the W3C to move this technology into the mainstream browser market through the W3C Audio Incubator Group, with the involvement of the BBC and Google Inc.

Others
• Some models of Texas Instruments home computers produced in 1979 and 1981 (the Texas Instruments TI-99/4 and TI-99/4A) were capable of text-to-phoneme synthesis or reciting complete words and phrases (text-to-dictionary), using a very popular Speech Synthesizer peripheral. TI used a proprietary codec to embed complete spoken phrases into applications, primarily video games.[33]
• IBM's OS/2 Warp 4 included VoiceType, a precursor to IBM ViaVoice.
• Systems that operate on free and open source software systems, including Linux, are various, and include open-source programs such as the Festival Speech Synthesis System, which uses diphone-based synthesis (and can use a limited number of MBROLA voices), and gnuspeech, which uses articulatory synthesis,[34] from the Free Software Foundation.
• Companies which developed speech synthesis systems but which are no longer in this business include BeST Speech (bought by L&H), Eloquent Technology (bought by SpeechWorks), Lernout & Hauspie (bought by Nuance), Rhetorical Systems (bought by Nuance), and SpeechWorks (bought by Nuance).

Speech synthesis markup languages
A number of markup languages have been established for the rendition of text as speech in an XML-compliant format. The most recent is Speech Synthesis Markup Language (SSML), which became a W3C recommendation in 2004. Older speech synthesis markup languages include Java Speech Markup Language (JSML) and SABLE. Although each of these was proposed as a standard, none of them has been widely adopted. Speech synthesis markup languages are distinguished from dialogue markup languages: VoiceXML, for example, includes tags related to speech recognition, dialogue management and touchtone dialing, in addition to text-to-speech markup.

Applications
Speech synthesis has long been a vital assistive technology tool, and its application in this area is significant and widespread. It allows environmental barriers to be removed for people with a wide range of disabilities. The longest application has been in the use of screen readers for people with visual impairment, but text-to-speech systems are now commonly used by people with dyslexia and other reading difficulties as well as by preliterate children. They are also frequently employed to aid those with severe speech impairment, usually through a dedicated voice output communication aid. Sites such as Ananova and YAKiToMe! have used speech synthesis to convert written news to audio content. YAKiToMe! is also used to convert entire books for personal podcasting purposes, RSS feeds and web pages for news stories, and educational texts for enhanced learning. This is also the aim of the Singing Computer project (which uses GNU LilyPond and Festival), which helps blind people check their lyric input.[35]

Speech synthesis techniques are also used in entertainment productions such as games, anime and similar. In 2007, Animo Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment industries, able to generate narration and lines of dialogue according to user specifications.[36] The application reached maturity in 2008, when NEC Biglobe announced a web service that allows users to create phrases from the voices of Code Geass: Lelouch of the Rebellion R2 characters.[37] Software such as Vocaloid can generate singing voices via lyrics and melody, and can be used for mobile applications. TTS applications such as YAKiToMe! and Speakonia are often used to add synthetic voices to YouTube videos for comedic effect, as in Barney Bunch videos.

Vocal loading
Vocal loading is the stress inflicted on the speech organs when speaking for long periods.

Background
Of the working population, about 15% have professions where their voice is their primary tool. That includes professions such as teachers, sales personnel, actors and singers, and TV and radio reporters.

Vocal loading

Vocal loading is the stress inflicted on the voice organs when speaking for an extended period. Many professional voice users, especially teachers and television and radio reporters, suffer from voice-related medical problems. In a larger scope, this involves millions of sick-leave days every year, both in the US and the European Union. Still, research in vocal loading has often been treated as a minor subject. The question therefore arises of how one should use one's voice to minimise tiring in the vocal organs. This is encompassed in the study of vocology, the science and practice of voice habilitation.

Voice organ

Voiced speech is produced by air streaming from the lungs through the vocal cords, setting them into an oscillating movement. In every oscillation the vocal folds are closed for a short period of time, and the airflow from the lungs is impeded. When the folds reopen, the pressure under the folds is released. These changes in pressure form the waves called (voiced) speech.

Loading on tissue in vocal folds

The fundamental frequency of speech for an average male is around 110 Hz and for an average female around 220 Hz. That means that for voiced sounds the vocal folds will hit together 110 or 220 times a second, respectively. Suppose then that someone is speaking continuously for an hour, of which perhaps five minutes is voiced speech. The folds will then hit together more than 30 thousand times an hour. It is intuitively clear that the vocal fold tissue will experience some tiring due to this large number of hits, just as any other muscle will experience strain if used for an extended period of time. Vocal loading also includes other kinds of strain on the speech organs, including all kinds of muscular strain in the speech organs; however, researchers' largest interest lies in the stress exerted on the vocal folds.

Symptoms

Objective evaluation or measurement of vocal loading is very difficult due to the tight coupling of the experienced psychological and physiological stress. However, there are some typical symptoms that can be objectively measured. Firstly, when a voice is loaded, the pitch range of the voice will decrease: the upper pitch limit will decrease and the lower pitch limit will rise. (Pitch range indicates the possible pitches that can be spoken.) Secondly, the volume range will decrease. In addition, an increase in the hoarseness and strain of a voice can often be heard. Unfortunately, both of these properties are difficult to measure objectively, and only perceptual evaluations can be performed.

Effect of speaking environment

Several studies in vocal loading show that the speaking environment does have a significant impact on vocal loading, although the exact details are debated. Most scientists agree on the effect of the following environmental properties:

 air humidity: dry air increases stress experienced in the vocal folds
 hydration: dehydration increases the effects of stress inflicted on the vocal folds
 background noise: people tend to speak louder when background noise is present, even when it isn't necessary, and increasing speaking volume increases the stress inflicted on the vocal folds
 pitch: the "normal" speaking style has close to optimal pitch; using a higher or lower pitch than normal will also increase stress in the speech organs

In addition, smoking and other types of air pollution might have a negative effect on the voice production organs.

Voice care

Basically, a normal, relaxed way of speaking is the optimal method for voice production, in both speech and singing. Any excess force used when speaking will increase tiring. The speaker should drink enough water, and the air humidity level should be normal or higher. No background noise should be present or, if that is not possible, the voice should be amplified. Smoking is discouraged.
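The collision figures quoted above can be sanity-checked with a few lines of arithmetic. This is an illustrative sketch only, using the average fundamental frequencies given in the text (110 Hz male, 220 Hz female) and the assumption of five minutes of voiced speech per hour of continuous speaking:

```python
# Back-of-the-envelope check of vocal fold collisions per hour.
# Assumption (from the text): one fold contact per glottal cycle,
# and roughly 5 minutes of voiced speech within an hour of speaking.

def collisions_per_hour(f0_hz: float, voiced_seconds: float = 5 * 60) -> int:
    """Vocal-fold contacts per hour at fundamental frequency f0_hz."""
    return int(f0_hz * voiced_seconds)

male = collisions_per_hour(110)    # 33,000 contacts per hour
female = collisions_per_hour(220)  # 66,000 contacts per hour
```

Both figures exceed the "more than 30 thousand times an hour" quoted in the text.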

Vocal rest

Vocal rest is the process of resting the vocal folds by not speaking or singing. It typically follows vocal disorders or viral infections that cause hoarseness in the voice, such as the common cold or influenza. The purpose of vocal rest is to hasten recovery time; it is believed that vocal rest, along with rehydration, will significantly decrease recovery time after a cold. It is generally believed that if one needs to communicate one should speak and not whisper. The reasons given for this differ: some believe that whispering merely does not allow the voice to rest and may have a dehydrating effect, while others hold that whispering can cause additional stress to the larynx.

Vocal range

Vocal range is the measure of the breadth of pitches that a human voice can phonate. Although the study of vocal range has little practical application in terms of speech, it is a topic of study within linguistics, phonetics, and speech and language pathology, particularly in relation to the study of tonal languages and certain types of vocal disorders. However, the most common application of the term "vocal range" is within the context of singing, where it is used as one of the major defining characteristics for classifying singing voices into groups known as voice types.

Singing and the definition of vocal range

While the broadest definition of vocal range is simply the span from the lowest to the highest note a particular voice can produce, this broad definition is often not what is meant when "vocal range" is discussed in the context of singing. Vocal pedagogists tend to define the vocal range as the total span of "musically useful" pitches that a singer can produce. This is because some of the notes a voice can produce may not be considered usable by the singer within performance for various reasons. For example, within opera all singers must project over an orchestra without the aid of a microphone; an opera singer would therefore only include within their vocal range the notes that they are able to adequately project over an orchestra. In contrast, a pop artist could include notes that could be heard with the aid of a microphone.

Another factor to consider is the use of different forms of vocal production. The human voice is capable of producing sounds using different physiological processes within the larynx. These different forms of voice production are known as vocal registers. While the exact number and definition of vocal registers is a controversial topic within the field of singing, the sciences identify only four registers: the whistle register, the falsetto register, the modal register (the register used in normal speech and most singing), and the vocal fry register. Typically, only the usable pitches within the modal register are included when determining a singer's vocal range. However, there are some instances where other vocal registers are included. For example, within opera, countertenors utilize falsetto often and coloratura sopranos utilize the whistle register frequently; these voice types would therefore include the notes from those other registers within their vocal range. Another example would be a male doo-wop singer who might quite regularly deploy his falsetto pitches in performance and thus include them in determining his range.

Vocal range and voice classification

Vocal range plays such an important role in classifying singing voices into voice types that sometimes the two terms are confused with one another. A voice type is a particular kind of human singing voice perceived as having certain identifying qualities or characteristics, vocal range being only one of those characteristics. Other factors are vocal weight, vocal tessitura, vocal timbre, vocal transition points, physical characteristics, speech level, scientific testing, and vocal registration. All of these factors combined are used to categorize a singer's voice into a particular kind of singing voice or voice type.

There are a plethora of different voice types used by vocal pedagogists today in a variety of voice classification systems. Most of these types, however, are sub-types that fall under seven different major voice categories that are for the most part acknowledged across all of the major voice classification systems. Women are typically divided into three groups: soprano, mezzo-soprano, and contralto. Men are usually divided into four groups: countertenor, tenor, baritone, and bass. When considering the pre-pubescent voices of children, an eighth term, treble, can be applied. Within each of these major categories there are several sub-categories that identify specific vocal qualities, such as coloratura facility and vocal weight, to differentiate between voices.

The following are the general vocal ranges associated with each voice type, using scientific pitch notation where middle C = C4. Some singers within these voice types may be able to sing somewhat higher or lower:

 Soprano: C4–C6
 Mezzo-soprano: A3–A5
 Contralto: F3–F5
 Tenor: C3–C5
 Baritone: F2–F4
 Bass: E2–E4

In terms of frequency, human voices are roughly in the range of 80 Hz to 1100 Hz (that is, E2 to C6) for normal male and female voices together.

Vocal range itself cannot determine a singer's voice type. While each voice type does have a general vocal range associated with it, human singing voices may possess vocal ranges that encompass more than one voice type or fall in between the typical ranges of two voice types. Therefore, voice teachers only use vocal range as one factor in classifying a singer's voice. More important than range in voice classification is tessitura, or where the voice is most comfortable singing, and vocal timbre, or the characteristic sound of the singing voice. For example, a female singer may have a vocal range that encompasses the high notes of a mezzo-soprano and the low notes of a soprano. A voice teacher would therefore look to see whether or not the singer were more comfortable singing up higher or singing lower. If the singer were more comfortable singing higher, the teacher would probably classify her as a soprano; if the singer were more comfortable singing lower, the teacher would probably classify her as a mezzo-soprano. The teacher would also listen to the sound of the voice: sopranos tend to have a lighter and less rich vocal sound than mezzo-sopranos. A voice teacher, however, would never classify a singer in more than one voice type, regardless of the size of their vocal range.

World records and extremes of vocal range

The following facts about female and male ranges are known:

 Highest note in a solo: Guinness lists the highest demanded note in the classical repertoire as G6 in "Popoli di Tessaglia," a concert aria by W. A. Mozart composed for Aloysia Weber, though pitch standards were not fixed in the eighteenth century. This rare note is also heard in the opera Esclarmonde by Jules Massenet. Several little-known works call for pitches higher than G6; for example, the soprano Mado Robin, who was known for her exceptionally high voice, sang a number of compositions created especially to exploit her highest notes, reaching C7.[citation needed] The highest note commonly called for is F6, famously heard in the Queen of the Night's two arias "Der Hölle Rache kocht in meinem Herzen" and "O zittre nicht, mein lieber Sohn" in Mozart's opera Die Zauberflöte.
 Lowest note in a solo: Guinness lists the lowest demanded note in the classical repertoire as D2 (almost two octaves below middle C) in Osmin's second aria in Mozart's Die Entführung aus dem Serail. Although Osmin's note is the lowest 'demanded' in the operatic repertoire, lower notes are frequently heard, both written and unwritten: it is traditional for basses to interpolate a low C in the duet "Ich gehe doch rathe ich dir" in the same opera, and Leonard Bernstein composed an optional B1 (a minor third below D2) in a bass aria[not specific enough to verify] in the opera house version of Candide.
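The note names above can be converted to frequencies to check the quoted 80 Hz to 1100 Hz span. This is a hedged sketch assuming twelve-tone equal temperament with A4 = 440 Hz (the text itself notes that historical pitch standards varied, so these values are modern conventions, not universal facts):

```python
# Convert scientific pitch notation (middle C = C4) to frequency,
# assuming equal temperament and A4 = 440 Hz.

SEMITONES = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}

def note_to_hz(name: str) -> float:
    """E.g. 'E2' -> ~82.4 Hz, 'C6' -> ~1046.5 Hz, 'A4' -> 440.0 Hz."""
    letter, octave = name[0], int(name[-1])
    accidental = {'#': 1, 'b': -1}.get(name[1:-1], 0)
    midi = 12 * (octave + 1) + SEMITONES[letter] + accidental
    return 440.0 * 2 ** ((midi - 69) / 12)

low, high = note_to_hz('E2'), note_to_hz('C6')  # ~82.4 Hz and ~1046.5 Hz
```

E2 comes out near 82 Hz and C6 near 1047 Hz, consistent with the "roughly 80 Hz to 1100 Hz" figure in the text.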

 Lowest note for a choir: Mahler's Eighth Symphony (bar 1457 in the "Chorus mysticus") and Rachmaninoff's Vespers require B♭1. In Russian choirs the oktavists traditionally sing an octave below the bass part. In a Russian piece combining solo and choral singing, "Do not deny me in my old age," Pavel Chesnokov directs the bass soloist to descend even lower, down to G1, depending on the arrangement.

Vocal warm up

A vocal warm-up is a series of exercises which prepare the voice for singing, acting, or other use.

Why warm up

Muscles all over the body are used when singing, the diaphragm being one of the most obvious. A study by Elliott, Sundberg, & Gramming emphasized that changing pitch undoubtedly stretches the muscles, and any singer will tell you that vocal warm-ups make them feel more prepared. Some warm-ups also train your voice: sometimes called vocalises, these activities teach breath control, diction, blending, and balance. Physical whole-body warm-ups also help prepare a singer. Stretches of the abdomen, back, neck, and shoulders are important to avoid stress, which influences the sound of the voice.

How to warm up

Breathing. Before you start to actually sing, it is important to start breathing properly and from the diaphragm. Start with simple exercises such as hissing: take a deep breath in, then make a hissing sound, breathing outwards until you've expelled as much air as possible from your lungs. Repeat several times, and be sure when you're breathing in to breathe using your diaphragm, not moving your shoulders up and down (that is a common sign of an untrained breather). You can also use lip trills and tongue trills to help control your breathing.

Range and tone. Start easy, with light humming. Pick a note in the middle of your range (middle C is reasonable) and begin humming. Start just using a steady note, then make a "fire engine sound" going up and down, letting the voice fall in a glissando without much control. Do several of these, working on getting really to the highest and lowest parts of your range, but don't push too high. Eventually move to real notes. To start warming up your range, sing down a five-note scale, starting in the middle of your range, using open vowels and words such as "Me, my, mo, mull." Repeat the exercise a half-step higher and continue up to the top of your range; after, repeat the exercise a half-step lower, down to the bottom of your comfortable range. Next, sing a slightly more difficult phrase, starting with a consonant like B, D, or P, with an open vowel and a sibilant like Z: "Za-a-a-a-a" is reasonable. Next, starting from middle C (or an octave lower than middle C, depending on your voice), sing an arpeggio of three thirds to an octave (1 3 5 1 5 3 1, the middle 1 being the upper octave), using open vowels like o, ay, ih, and ah. Finally, jump first an octave, then down a fourth, then down a third, then another third (1 8 5 3 1); the phrase "I lo-ove to sing" fits with this exercise. Others choose to sing a few words over and over to warm up.
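The scale-degree shorthand used in these exercises can be made concrete by mapping degrees of a major scale to semitone offsets. This is a hypothetical illustration (the degree names and transposition step follow the text; the middle "1" of the first arpeggio is taken as the upper octave, written here as degree 8):

```python
# Map major-scale degrees used in the warm-up exercises to semitone
# offsets from the starting note (1 -> 0, 3 -> 4, 5 -> 7, 8 -> 12).
# Transposing by +1 semitone mirrors "repeat a half-step higher."

DEGREE_TO_SEMITONE = {1: 0, 3: 4, 5: 7, 8: 12}

def exercise(degrees, transpose=0):
    """Return the exercise as semitone offsets, optionally transposed."""
    return [DEGREE_TO_SEMITONE[d] + transpose for d in degrees]

arpeggio = exercise([1, 3, 5, 8, 5, 3, 1])    # thirds up to the octave
octave_leap = exercise([1, 8, 5, 3, 1])       # "I lo-ove to sing"
half_step_up = exercise([1, 3, 5, 8, 5, 3, 1], transpose=1)
```

Repeating `exercise(..., transpose=n)` for increasing `n` generates the ascending sequence of repetitions described above.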

Vocology

Vocology is the science of enabling or endowing the human voice with greater ability or fitness. Its concerns include the nature of speech and language pathology, the defects of the vocal tract (laryngology), phoniatrics, the remediation of speech therapy, and the voice training and voice pedagogy of song and speech for actors and public speakers.

Meaning and origin of term

Vocology was invented (simultaneously, but independently) by Ingo R. Titze and by George Gates, an otolaryngologist at Washington University. Titze defines vocology as "the science and practice of voice habilitation, with a strong emphasis on habilitation." To habilitate means to "enable," to "equip for," to "capacitate"; in other words, "to assist in performing whatever function that needs to be performed." He goes on that this "is more than repairing a voice or bringing it back to a former state; rather, it is the process of strengthening and equipping the voice to meet very specific and special demands."

The study of vocology is recognized academically in taught courses and institutes such as the National Center for Voice and Speech, Westminster Choir College at Rider University, the Vox Humana Laboratory at St. Luke's-Roosevelt Hospital Center, the Grabscheid Voice Center at Mount Sinai Medical Center, the Regional Center for Voice and Swallowing, and Milan's Azienda Ospedaliera Fatebenefratelli e Oftalmico. Also reflecting this increased recognition is that when the Scandinavian journal of logopedics & phoniatrics and Voice merged in 1996, the new name selected was Logopedics Phoniatrics Vocology.

Voice analysis

Voice analysis is the study of speech sounds for purposes other than linguistic content, such as in speech recognition, but also speaker identification. Such studies mostly include medical analysis of the voice. More controversially, some believe that the truthfulness or emotional state of speakers can be determined using Voice Stress Analysis or Layered Voice Analysis.

Typical voice problems

A medical study of the voice can be, for instance, analysis of the voice of patients who have had a polyp removed from their vocal cords through an operation. In order to objectively evaluate the improvement in voice quality there has to be some measure of voice quality. An experienced voice therapist can quite reliably evaluate the voice, but this requires extensive training and is still always subjective.

Another active research topic in medical voice analysis is vocal loading evaluation. The vocal cords of a person speaking for an extended period of time will suffer from tiring; that is, the process of speaking exerts a load on the vocal cords where the tissue will suffer from tiring. Among professional voice users (i.e. teachers, sales people) this tiring can cause voice failures and sick leaves. To evaluate these problems, vocal loading needs to be objectively measured.

Analysis methods

Voice problems that require voice analysis most commonly originate from the vocal folds or the laryngeal musculature that controls them, since the folds are subject to collision forces with each vibratory cycle and to drying from the air being forced through the small gap between them, and the laryngeal musculature is intensely active during speech or singing and is subject to tiring. However, dynamic analysis of the vocal folds and their movement is physically difficult. The location of the vocal folds effectively prohibits direct, invasive measurement of movement. Less invasive imaging methods such as x-rays or ultrasounds do not work because the vocal cords are surrounded by cartilage, which distorts image quality. Movements in the vocal cords are rapid; fundamental frequencies are usually between 80 and 300 Hz, thus preventing usage of ordinary video. Stroboscopic and high-speed videos provide an option, but in order to see the vocal folds a fiberoptic probe leading to the camera has to be positioned in the throat, which makes speaking difficult. In addition, placing objects in the pharynx usually triggers a gag reflex that stops voicing and closes the larynx. Furthermore, stroboscopic imaging is only useful when the vocal fold vibratory pattern is closely periodic.

The most important indirect methods are currently inverse filtering of either microphone or oral airflow recordings, and electroglottography (EGG). In inverse filtering, the speech sound (the radiated acoustic pressure waveform, as obtained from a microphone) or the oral airflow waveform from a circumferentially vented (CV) mask is recorded outside the mouth and then filtered by a mathematical method to remove the effects of the vocal tract. This method produces an estimate of the waveform of the glottal airflow pulses, which in turn reflect the movements of the vocal folds. The other kind of noninvasive indirect indication of vocal fold motion is electroglottography, in which electrodes placed on either side of the subject's throat at the level of the vocal folds record the changes in the conductivity of the throat according to how large a portion of the vocal folds are touching each other. It thus yields one-dimensional information of the contact area. Neither inverse filtering nor EGG is sufficient to completely describe the complex three-dimensional pattern of vocal fold movement, but both can provide useful indirect evidence of that movement.

Fundamental frequency

The voiced speech of a typical adult male will have a fundamental frequency from 85 to 180 Hz, and that of a typical adult female from 165 to 255 Hz.

List of voice disorders

Voice disorders are medical conditions affecting the production of speech. These include:

 Chorditis
 Vocal fold nodules
 Vocal fold cysts
 Vocal cord paresis
 Reinke's edema
 Spasmodic dysphonia
 Foreign accent syndrome
 Bogart-Bacall syndrome
 Laryngeal papillomatosis
 Puberphonia

Voice frequency

A voice frequency (VF) or voice band is one of the frequencies, within part of the audio range, that is used for the transmission of speech. In telephony, the usable voice frequency band ranges from approximately 300 Hz to 3400 Hz. It is for this reason that the ultra low frequency band of the electromagnetic spectrum between 300 and 3000 Hz is also referred to as voice frequency (despite the fact that this is electromagnetic energy, not acoustic energy). The bandwidth allocated for a single voice-frequency transmission channel is usually 4 kHz, including guard bands, allowing a sampling rate of 8 kHz to be used as the basis of the pulse code modulation system used for the digital PSTN. Thus, the fundamental frequency of most speech falls below the bottom of the "voice frequency" band as defined above. However, enough of the harmonic series will be present for the missing fundamental to create the impression of hearing the fundamental tone.
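The missing-fundamental effect can be sketched numerically: synthesize only the harmonics of a 100 Hz voice that survive a 300–3400 Hz telephone band, and show that the 100 Hz periodicity is still recoverable from the waveform. This is an illustrative sketch at the 8 kHz sampling rate mentioned above; the autocorrelation pitch estimator is a standard textbook method, not something specified in the text:

```python
import math

# A 100 Hz fundamental is below the 300 Hz telephone band edge, so only
# harmonics 3, 4, 5 (300, 400, 500 Hz) are synthesized here. Their common
# period is still 1/100 s, which an autocorrelation peak search recovers.

FS, F0, N = 8000, 100.0, 800  # 8 kHz sampling, 0.1 s of signal
signal = [sum(math.sin(2 * math.pi * k * F0 * n / FS) for k in (3, 4, 5))
          for n in range(N)]

def estimate_f0(x, fs, lo=60.0, hi=400.0):
    """Return fs/lag for the lag (in the 60-400 Hz range) that maximizes
    the unnormalized autocorrelation of x."""
    lags = range(int(fs / hi), int(fs / lo) + 1)
    best = max(lags, key=lambda lag: sum(a * b for a, b in zip(x, x[lag:])))
    return fs / best

f0 = estimate_f0(signal, FS)  # recovers 100.0 Hz despite the absent fundamental
```

The recovered pitch is 100 Hz even though no energy exists at 100 Hz, which is why telephone speech still sounds like it has its original pitch.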

Vocal apparatus

The human head and neck (internal).

Vocal apparatus or vocal organs is a term used in phonetics to designate all parts of human anatomy that can be used to produce speech. This includes the lips, tongue, teeth, hard and soft palates, uvula, jaw, lungs, larynx (voice box) and pharynx (back of the throat). The voice organ is the part of the human body responsible for the generation of sound, usually in the form of speech or singing. It comprises the larynx and the vocal tract.

Overview

The human voice produces sounds in the following manner:

 1. Air pressure from the lungs creates a steady flow of air through the trachea (windpipe), larynx (voice box) and pharynx (back of the throat).
 2. The vocal folds in the larynx vibrate, creating fluctuations in air pressure that are known as sound waves.
 3. Resonances in the vocal tract modify these waves according to the position and shape of the lips, jaw, tongue, soft palate, and other speech organs, creating formant regions and thus different qualities of sonorant (voiced) sound.
 4. Mouth and nose openings radiate the sound waves into the environment.

The larynx

The larynx or voice box is a cylindrical framework of cartilage that serves to anchor the vocal folds. When the muscles of the vocal folds contract, the airflow from the lungs is impeded until the vocal folds are forced apart again by the increasing air pressure from the lungs. This process continues in a periodic cycle that is felt as a vibration (buzzing). In singing, the vibration frequency of the vocal folds determines the pitch of the sound produced. Voiced phonemes such as the pure vowels are, by definition, distinguished by the buzzing sound of this periodic oscillation of the vocal cords. A rubber balloon, inflated but not tied off and stretched tightly across the neck, produces a squeak or buzz, depending on the tension across the neck and the level of pressure inside the balloon. Similar actions, with similar results, occur when the vocal cords are contracted or relaxed across the larynx. The lips of the mouth can be used in a similar way to create a similar sound, as any toddler or trumpeter can demonstrate.

The vocal tract

The sound source from the larynx is not sufficiently loud to be heard as speech, nor can the various timbres of different vowel sounds be produced: without the vocal tract, only a buzzing sound would be heard. Our interest is therefore most focused on further modulations of and additions to the fundamental tone by other parts of the vocal apparatus, determined by the variable dimensions of oral, pharyngeal, and even nasal cavities.

Production of vowels

A vowel is any phoneme in which airflow is impeded only or mostly by the voicing action of the vocal cords. The well-defined fundamental frequency provided by the vocal cords in voiced phonemes is only a convenience, however, not a necessity, since a strictly unvoiced whisper is still quite intelligible.

Formants

Formants are the resonant frequencies of the vocal tract that emphasize particular voice harmonics near in frequency to the resonance, or turbulent non-periodic energy (i.e. noise) near the formant frequency in the case of whispered speech. The formants tell a listener what vowel is being spoken.

Vocal pedagogy

Laryngoscopic view of the vocal folds.

Vocal pedagogy is the study of the art and science of voice instruction. It is utilized in the teaching of singing and assists in defining what singing is, how singing works, and how proper singing technique is accomplished. The anatomy of the vocal folds is an important topic in the field of vocal pedagogy.
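The claim that "the formants tell a listener what vowel is being spoken" can be sketched as a nearest-neighbour lookup against reference formant values. The (F1, F2) pairs below are approximate averages for adult male speakers after the classic Peterson & Barney (1952) measurements; they are illustrative, not definitive, and real recognizers use many more vowels and features:

```python
import math

# Approximate average first and second formants (F1, F2) in Hz for three
# point vowels, adult male speakers (after Peterson & Barney, 1952).
VOWEL_FORMANTS = {
    'i': (270, 2290),  # as in "heed"
    'a': (730, 1090),  # as in "hod"
    'u': (300, 870),   # as in "who'd"
}

def classify(f1: float, f2: float) -> str:
    """Return the vowel whose reference formants are nearest (Euclidean)."""
    return min(VOWEL_FORMANTS,
               key=lambda v: math.dist((f1, f2), VOWEL_FORMANTS[v]))
```

For example, a measured pair near (700, 1100) Hz lands on 'a', while (280, 2200) Hz lands on 'i'.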

such as range extension. Highly influential in the development of a vocal pedagogical system were monks Johannes de Garlandia and Jerome of Moravia who were the first to develop a concept of vocal registers. It is unclear. and capitis). guttoris.MeSH Vocal+Cords Vocal pedagogy covers a broad range of aspects of singing. History Pythagoras. throat voice . or Art song  Phonetics  Voice classification All of these different concepts are a part of developing proper vocal technique. the monasteries were the center of musical intellectual life during the medieval period and many men within the monasteries devoted their time to the study of music and the art of singing. Not all vocal teachers have the same opinions within every topic of study which causes variations in pedagogical approaches and vocal technique. The first surviving record of a systematized approach to teaching singing was developed in the medieval monasteries of the Roman Catholic Church sometime near the beginning of the 13th century. is much more similar to the modern pedagogists . teaching music. however. however. Typical areas of study include:  Human anatomy and physiology as it relates to the physical process of singing. such as learning to sing opera. Scholars such as Alypius and Pythagoras studied and made observations on the art of singing. Their concept of head voice. belt. vowels and articulation  Vocal registration  Sostenuto and legato for singing  Other singing elements. and head voice (pectoris . These men identified three registers: chest voice. ranging from the physiological process of vocal production to the artistic aspects of interpretation of songs from different genres or historical eras. coloratura  Vocal health and voice disorders related to singing  Vocal styles. As with other fields of study. vibrato. the man in the center with the book.  Breathing and air support for singing  Posture for singing  Phonation  Vocal resonation or voice projection  Diction. 
whether the Greeks ever developed a systematic approach to teaching singing as little writing on the subject survives today. the study of vocal pedagogy began in Ancient Greece. in The School of Athens by Raphael Within Western culture. tone quality.

Other concepts discussed in the monastic system included vocal resonance, voice classification, breath support, diction, and tone quality, to name a few. The ideas developed within the monastic system highly influenced the development of vocal pedagogy over the next several centuries, including the bel canto style of singing. The church also remained at the forefront of musical composition at this time and remained highly influential in shaping musical tastes and practices both in and outside the church. It was the Catholic Church that first popularized the use of castrato singers in the 16th century, which ultimately led to the popularity of castrato voices in Baroque and Classical operas.

With the onset of the Renaissance in the 15th century, the study of singing began to move outside of the church. The courts of rich patrons, such as the Dukes of Burgundy who supported the Burgundian School and the Franco-Flemish School, became secular centers of study for singing and all other areas of musical study. The vocal pedagogical methods taught in these schools, however, were based on the concepts developed within the monastic system, and many of the teachers within these schools had their initial musical training from singing in church choirs as children.

It was not until the development of opera in the 17th century that vocal pedagogy began to break away from some of the established thinking of the monastic writers and develop deeper understandings of the physical process of singing and its relation to key concepts like vocal registration and vocal resonation. It was also during this time that noted voice teachers began to emerge; Giulio Caccini is an example of an important early Italian voice teacher. In the late 17th century, the bel canto method of singing began to develop in Italy. This style of singing had a huge impact on the development of opera and the development of vocal pedagogy during the Classical and Romantic periods. It was during this time that teachers and composers first began to identify singers by, and write roles for, more specific voice types. However, it wasn't until the 19th century that more clearly defined voice classification systems like the German Fach system emerged. Within these systems, more descriptive terms were used in classifying voices, such as coloratura soprano and lyric soprano.

Examining the vocal mechanism with a laryngoscope, late 19th century.

Voice teachers in the 19th century continued to train singers for careers in opera. Manuel Patricio Rodríguez García is often considered one of the most important voice teachers of the 19th century, and is credited with the development of the laryngoscope and the beginning of modern voice pedagogy.

Mathilde Marchesi was both an important singer and teacher of singing at the turn of the 20th century.

The field of voice pedagogy became more fully developed in the middle of the 20th century. A few American voice teachers began to study the science, anatomy, and physiology of singing, especially Ralph Appelman at Indiana University, Oren Brown at the Washington University School of Medicine and later the Juilliard School, and William Vennard at the University of Southern California, adding these scientific ideas to the standard exercises and empirical ways to improve vocal technique. This shift in approach to the study of singing led to the rejection of many of the assertions of the bel canto singing method, most particularly in the areas of vocal registration and vocal resonation. As a result, there are currently two predominating schools of thought among voice teachers today: those who maintain the historical positions of the bel canto method, and those who choose to embrace more contemporary understandings based in current knowledge of human anatomy and physiology. There are also those teachers who borrow ideas from both perspectives, creating a hybrid of the two.

Appelman and Vennard were also part of a group of voice instructors who developed courses of study for beginning voice teachers, and by 1980 the subject of voice pedagogy was beginning to be included in many college music degree programs for singers and vocal music educators. More recent works by authors such as Richard Miller and Johan Sundberg have increased the general knowledge of voice teachers, and scientific and practical aspects of voice pedagogy continue to be studied and discussed by professionals. In addition, the creation of organisations such as the National Association of Teachers of Singing (now an international organization of vocal instructors) has enabled voice teachers to establish more of a consensus about their work, and has expanded the understanding of what singing teachers do.

Topics of study

Pedagogical philosophy

There are basically three major approaches to vocal pedagogy, all related to how the mechanistic and psychological controls are employed within the act of singing. Some voice instructors advocate an extreme mechanistic approach that believes that singing is largely a matter of getting the right physical parts in the right places at the right time, and that correcting vocal faults is accomplished by calling direct attention to the parts which are not working well.

On the other extreme is the school of thought that believes that attention should never be directed to any part of the vocal mechanism: that singing is a matter of producing the right mental images of the desired tone, and that correcting vocal faults is achieved by learning to think the right thoughts and by releasing the emotions through interpretation of the music. Most voice teachers, however, believe that the truth lies somewhere in between the two extremes and adopt a composite of those two approaches.

The nature of vocal sounds

Physiology of vocal sound production

There are four physical processes involved in producing vocal sound: respiration, phonation, resonation, and articulation. These processes occur in the following sequence:

 1. Breath is taken
 2. Sound is initiated in the larynx
 3. The vocal resonators receive the sound and influence it
 4. The articulators shape the sound into recognizable units

Although these four processes are to be considered separately, in actual practice they merge into one coordinated function. With an effective singer or speaker, one should rarely be reminded of the process involved, as their mind and body are so coordinated that one only perceives the resulting unified function. Many vocal problems result from a lack of coordination within this process.

Respiration

A labeled anatomical diagram of the vocal folds or cords.

In its most basic sense, respiration is the process of moving air in and out of the body: inhalation and exhalation. Breathing for singing and speaking is a more controlled process than the ordinary breathing used for sustaining life. The controls applied to exhalation are particularly important in good vocal technique.

Phonation

Phonation is the process of producing vocal sound by the vibration of the vocal folds, which is in turn modified by the resonance of the vocal tract. It takes place in the larynx when the vocal folds are brought together and breath pressure is applied to them in such a way that vibration ensues, causing an audible source of acoustic energy, i.e. sound, which can then be modified by the articulatory actions of the rest of the vocal apparatus. The vocal folds are brought together primarily by the action of the interarytenoid muscles, which pull the arytenoid cartilages together.

Resonation

Vocal resonation is the process by which the basic product of phonation is enhanced in timbre and/or intensity by the air-filled cavities through which it passes on its way to the outside air. Various terms related to the resonation process include amplification, enrichment, enlargement, improvement, intensification, and prolongation, although in strictly scientific usage acoustic authorities would question most of them. The main point to be drawn from these terms by a singer or speaker is that the end result of resonation is, or should be, to make a better sound. There are seven areas that may be listed as possible vocal resonators. In sequence from the lowest within the body to the highest, these areas are the chest, the tracheal tree, the larynx itself, the pharynx, the oral cavity, the nasal cavity, and the sinuses.

Articulation

Articulation is the process by which the joint product of the vibrator and the resonators is shaped into recognizable speech sounds through the muscular adjustments and movements of the speech organs. These adjustments and movements of the articulators result in verbal communication and thus form the essential difference between the human voice and other musical instruments. Singing without understandable words limits the voice to nonverbal communication.

Places of articulation (passive and active): 1. Exo-labial, 2. Endo-labial, 3. Dental, 4. Alveolar, 5. Post-alveolar, 6. Pre-palatal, 7. Palatal, 8. Velar, 9. Uvular, 10. Pharyngeal, 11. Glottal, 12. Epiglottal, 13. Radical, 14. Postero-dorsal, 15. Antero-dorsal, 16. Laminal, 17. Apical, 18. Sub-apical.

Vocal instructors tend to focus more on active articulation as opposed to passive articulation. There are five basic active articulators: the lip ("labial consonants"), the flexible front of the tongue ("coronal consonants"), the middle/back of the tongue ("dorsal consonants"), the root of the tongue together with the epiglottis ("radical consonants"), and the larynx ("laryngeal consonants"). These articulators can act independently of each other, and two or more may work together in what is called coarticulation. When the front of the tongue is used, it may be the upper surface or blade of the tongue that makes contact ("laminal consonants"), the tip of the tongue ("apical consonants"), or the under surface ("sub-apical consonants"). Unlike active articulation, passive articulation is a continuum without many clear-cut boundaries. The places linguolabial and interdental, interdental and dental, dental and alveolar, alveolar and palatal, palatal and velar, and velar and uvular merge into one another, and a consonant may be pronounced somewhere between the named places. These articulations also merge into one another without clear boundaries.

Classification of vocal sounds

Vocal sounds are divided into two basic categories, vowels and consonants, with a wide variety of sub-classifications. Voice teachers and serious voice students spend a great deal of time studying how the voice forms vowels and consonants, and studying the problems that certain consonants or vowels may cause while singing. The International Phonetic Alphabet is used frequently by voice teachers and their students.

Problems in describing vocal sounds

Describing vocal sound is an inexact science, largely because the human voice is a self-contained instrument. Since the vocal instrument is internal, the singer's ability to monitor the sound produced is complicated by the vibrations carried to the ear through the Eustachian (auditory) tube and the bony structures of the head and neck. As a result, most singers hear something different in their ears and head than what a person listening to them hears. In relation to the physical process of singing, voice teachers therefore often focus less on how singing "sounds" and more on how it "feels". Vibratory sensations resulting from the closely related processes of phonation and resonation, and kinesthetic ones arising from muscle tension, movement, body position, and weight, serve as a guide to the singer on correct vocal production.

Another problem in describing vocal sound lies in the vocal vocabulary itself. There are many schools of thought within vocal pedagogy, and different schools have adopted different terms, sometimes from other artistic disciplines. This has led to the use of a plethora of descriptive terms applied to the voice which are not always understood to mean the same thing.

Interpretation

Interpretation is sometimes listed by voice teachers as a fifth physical process even though, strictly speaking, it is not a physical process. The reason for this is that interpretation does influence the kind of sound a singer makes, which is ultimately achieved through a physical action the singer is doing. Although teachers may acquaint their students with musical styles and performance practices and suggest certain interpretive effects, most voice teachers agree that interpretation cannot be taught. Students who lack a natural creative imagination and aesthetic sensibility cannot learn it from someone else. Failure to interpret well is not a vocal fault, even though it may affect vocal sound significantly.

Voice classification

In European classical music and opera, voices are treated like musical instruments. Composers who write vocal music must have an understanding of the skills, talents, and vocal properties of singers. Voice classification is the process by which human singing voices are evaluated and thereby designated into voice types (female voices: soprano, mezzo-soprano, contralto; male voices: countertenor, tenor, baritone, bass). These qualities include but are not limited to: vocal range, vocal weight, vocal tessitura, vocal timbre, and vocal transition points such as breaks and lifts within the voice. Other considerations are physical characteristics, speech level, scientific testing, and vocal registration. Voice classification is often used within opera to associate possible roles with potential voices. The science behind voice classification developed within European classical music and has been slow in adapting to more modern forms of singing. There are currently several different systems in use within classical music, including the German Fach system and the choral music system, among many others. No system is universally applied or accepted.

Some terms sometimes used to describe a quality of a voice's sound are: warm, white, light, dark, reedy, spread, focused, covered, swallowed, forward, ringing, hooty, bleaty, plummy, mellow, pear-shaped, and so forth.

Posture

The singing process functions best when certain physical conditions of the body exist. The ability to move air in and out of the body freely and to obtain the needed quantity of air can be seriously affected by the posture of the various parts of the breathing mechanism. A sunken chest position will limit the capacity of the lungs, and a tense abdominal wall will inhibit the downward travel of the diaphragm. Good posture allows the breathing mechanism to fulfill its basic function efficiently without any undue expenditure of energy. Good posture also makes it easier to initiate phonation and to tune the resonators, as proper alignment prevents unnecessary tension in the body. Voice instructors have also noted that when singers assume good posture it often provides them with a greater sense of self-assurance and poise while performing. Audiences also tend to respond better to singers with good posture. Habitual good posture ultimately improves the overall health of the body by enabling better blood circulation and preventing fatigue and stress on the body.

Breathing and breath support

In the words of Robert C. White, who paraphrased a "Credo" for singing (no blasphemy intended): "In the Beginning there was Breath, and Singing was with Breath, and Singing was Breath. And all Singing was made by the Breath, and without Breath was not any Singing made that was made." (White 1988, p. 26)

All singing begins with breath. All vocal sounds are created by vibrations in the larynx caused by air from the lungs. Breathing in everyday life is a subconscious bodily function which occurs naturally; however, the singer must have control of the intake and exhalation of breath to achieve maximum results from the voice. Natural breathing has three stages: a breathing-in period, a breathing-out period, and a resting or recovery period; these stages are not usually consciously controlled. Within singing there are four stages of breathing:

 1. a breathing-in period (inhalation)
 2. a setting-up-controls period (suspension)
 3. a controlled exhalation period (phonation)
 4. a recovery period

These stages must be under conscious control by the singer until they become conditioned reflexes. Many singers abandon conscious controls before their reflexes are fully conditioned, which ultimately leads to chronic vocal problems.

However, most classical music systems acknowledge seven different major voice categories. Women are typically divided into three groups: soprano, mezzo-soprano, and contralto. Men are usually divided into four groups: countertenor, tenor, baritone, and bass. When considering children's voices, an eighth term, treble, can be applied. Within each of these major categories there are several sub-categories that identify specific vocal qualities, such as coloratura facility and vocal weight, to differentiate between voices.

It should be noted that within choral music, singers' voices are divided solely on the basis of vocal range. Choral music most commonly divides vocal parts into high and low voices within each sex (SATB). As a result, the typical choral situation affords many opportunities for misclassification to occur. Since most people have medium voices, they must be assigned to a part that is either too high or too low for them: the mezzo-soprano must sing soprano or alto, and the baritone must sing tenor or bass. Either option can present problems for the singer, but for most singers there are fewer dangers in singing too low than in singing too high.

Within contemporary forms of music (sometimes referred to as Contemporary Commercial Music), singers are classified by the style of music they sing, such as jazz, pop, blues, soul, country, folk, and rock styles. There is currently no authoritative voice classification system within non-classical music. Attempts have been made to adopt classical voice type terms to other forms of singing, but such attempts have been met with controversy. The development of voice categorizations was made with the understanding that the singer would be using classical vocal technique within a specified range using unamplified (no microphones) vocal production. Since contemporary musicians use different vocal techniques and microphones, and are not forced to fit into a specific vocal role, applying such terms as soprano, tenor, baritone, etc. can be misleading or even inaccurate.

Dangers of quick identification

Many voice teachers warn of the dangers of quick identification. Premature concern with classification can result in misclassification, with all its attendant dangers. Vennard says: "I never feel any urgency about classifying a beginning student. So many premature diagnoses have been proved wrong, and it can be harmful to the student and embarrassing to the teacher to keep striving for an ill-chosen goal. It is best to begin in the middle part of the voice and work upward and downward until the voice classifies itself."

Most voice teachers believe that it is essential to establish good vocal habits within a limited and comfortable range before attempting to classify the voice. When techniques of posture, breathing, phonation, resonation, and articulation have become established in this comfortable area, the true quality of the voice will emerge and the upper and lower limits of the range can be explored safely. Only then can a tentative classification be arrived at, and it may be adjusted as the voice continues to develop. Many acclaimed voice instructors suggest that teachers begin by assuming that a voice is of a medium classification until it proves otherwise; the majority of individuals possess medium voices, so this approach is less likely to misclassify or damage the voice.

Vocal registration

Vocal registers, from highest to lowest: whistle, falsetto, modal, vocal fry.

Vocal registration refers to the system of vocal registers within the human voice. A register in the human voice is a particular series of tones, produced in the same vibratory pattern of the vocal folds and possessing the same quality. Registers originate in laryngeal function.

They occur because the vocal folds are capable of producing several different vibratory patterns. Each of these vibratory patterns appears within a particular range of pitches and produces certain characteristic sounds. The term register can be somewhat confusing, as it encompasses several aspects of the human voice. It can be used to refer to any of the following:

  A particular part of the vocal range, such as the upper, middle, or lower registers.
  A resonance area, such as chest voice or head voice.
  A phonatory process.
  A certain vocal timbre.
  A region of the voice which is defined or delimited by vocal breaks.
  A subset of a language used for a particular purpose or in a particular social setting (in linguistics, a register language is a language which combines tone and vowel phonation into a single phonological system).

Within speech pathology the term vocal register has three constituent elements: a certain vibratory pattern of the vocal folds, a certain series of pitches, and a certain type of sound. Speech pathologists identify four vocal registers based on the physiology of laryngeal function: the vocal fry register, the modal register, the falsetto register, and the whistle register. This view is also adopted by many teachers of singing.

Some voice teachers, however, organize registers differently; there are over a dozen different constructs of vocal registers in use within the field. The confusion which exists concerning what a register is, and how many registers there are, is due in part to what takes place in the modal register when a person sings from the lowest pitches of that register to the highest. The frequency of vibration of the vocal folds is determined by their length, tension, and mass. As pitch rises, the vocal folds are lengthened, tension increases, and their thickness decreases. In other words, all three of these factors are in a state of flux in the transition from the lowest to the highest tones. If a singer holds any of these factors constant and interferes with their progressive state of change, the laryngeal function tends to become static and eventually breaks occur, with obvious changes of tone quality. These breaks are often identified as register boundaries or as transition areas between registers. The distinct change or break between registers is called a passaggio or a ponticello. Vocal instructors teach that with study a singer can move effortlessly from one register to the other with ease and consistent tone. Registers can even overlap while singing. Teachers who like to use this theory of "blending registers" usually help students through the "passage" from one register to another by hiding their "lift" (where the voice changes). However, many voice instructors disagree with this distinction of boundaries, blaming such breaks on vocal problems created by a static laryngeal adjustment that does not permit the necessary changes to take place. This difference of opinion has affected the different views on vocal registration.

Coordination

Singing is an integrated and coordinated act, and it is difficult to discuss any of the individual technical areas and processes without relating them to the others. For example, phonation only comes into perspective when it is connected with respiration; the articulators affect resonance; the resonators affect the vocal folds; the vocal folds affect breath control; and so forth. Vocal problems are often a result of a breakdown in one part of this coordinated process, which causes voice teachers to focus intensively on one area of the process with their student until that issue is resolved. However, some areas of the art of singing are so much the result of coordinated functions that it is hard to discuss them under a traditional heading like phonation, resonation, articulation, or respiration.

Once the voice student has become aware of the physical processes that make up the act of singing and of how those processes function, the student begins the task of trying to coordinate them. Inevitably, students and teachers will become more concerned with one area of the technique than another. The various processes may progress at different rates, with a resulting imbalance or lack of coordination.
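The qualitative dependence described above, that vocal-fold frequency rises with tension and falls with mass, can be illustrated with the ideal-string approximation f = (1/2L)·sqrt(T/μ). This is a deliberate simplification for illustration only: real vocal folds are not uniform strings, and the numbers below are invented, not physiological measurements.

```python
import math

def string_frequency(length_m: float, tension_n: float, mass_per_length: float) -> float:
    """Fundamental frequency (Hz) of an ideal string: f = (1/2L) * sqrt(T/mu).

    Used here only to show the qualitative trends the text describes:
    higher tension -> higher pitch, more mass per length -> lower pitch.
    """
    return (1.0 / (2.0 * length_m)) * math.sqrt(tension_n / mass_per_length)

# Made-up example values: quadrupling tension doubles the frequency.
low = string_frequency(0.016, 0.5, 0.001)
high = string_frequency(0.016, 2.0, 0.001)
print(round(high / low, 2))
```

The same formula also shows why shorter, thinner vibrating bodies produce higher pitches, which parallels the text's account of the folds thinning as pitch rises.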

Most voice teachers believe that the first step in coordinating these processes is establishing good vocal habits in the most comfortable tessitura of the voice before slowly expanding the range beyond that. The areas of vocal technique which seem to depend most strongly on the student's ability to coordinate various functions are:

 1. Extending the vocal range to its maximum potential
 2. Developing consistent vocal production with a consistent tone quality
 3. Developing flexibility and agility
 4. Achieving a balanced vibrato

Developing the singing voice

Singing is not a natural process but a skill that requires highly developed muscle reflexes. Singing does not require much muscle strength, but it does require a high degree of muscle coordination. Individuals can develop their voices further through the careful and systematic practice of both songs and vocal exercises. Voice teachers instruct their students to exercise their voices in an intelligent manner. Singers should be thinking constantly about the kind of sound they are making and the kind of sensations they are feeling while they are singing.

Exercising the singing voice

There are several purposes for vocal exercises, including:

 1. Warming up the voice
 2. Extending the vocal range
 3. "Lining up" the voice horizontally and vertically
 4. Acquiring vocal techniques such as legato, staccato, control of dynamics, rapid figurations, learning to comfortably sing wide intervals, and correcting vocal faults

Extending the vocal range

An important goal of vocal development is to learn to sing to the natural limits of one's vocal range without any obvious or distracting changes of quality or technique. Voice instructors teach that a singer can only achieve this goal when all of the physical processes involved in singing (such as laryngeal action, breath support, resonance adjustment, and articulatory movement) are effectively working together. There are three factors which significantly affect the ability to sing higher or lower:

 1. The Energy Factor. In this usage the word energy has several connotations. It refers to the total response of the body to the making of sound; to a dynamic relationship between the breathing-in muscles and the breathing-out muscles, known as the breath support mechanism; to the amount of breath pressure delivered to the vocal folds and their resistance to that pressure; and to the dynamic level of the sound.

 2. The Space Factor. Space refers to the amount of space created by the moving of the mouth and the position of the palate and larynx. Generally speaking, a singer's mouth should be opened wider the higher they sing. The internal space or position of the soft palate and larynx can be widened by relaxing the throat. Voice teachers often describe this as feeling like the "beginning of a yawn".

 3. The Depth Factor. In this usage the word depth has two connotations: the actual physical sensations of depth in the body and vocal mechanism, and mental concepts of depth as related to tone quality.

McKinney says: "These three factors can be expressed in three basic rules: (1) As you sing higher, you must use more energy; as you sing lower, you must use less. (2) As you sing higher, you must use more space; as you sing lower, you must use less. (3) As you sing higher, you must use more depth; as you sing lower, you must use less."

General music studies

Some voice teachers will spend time working with their students on general music knowledge and skills, particularly music theory, music history, and musical styles and practices as they relate to the vocal literature being studied. If required, they may also spend time helping their students become better sight readers, often adopting Solfege, which assigns certain syllables to the notes of the scale.

Performance skills and practices

Since singing is a performing art, voice teachers spend some of their time preparing their students for performance. This includes teaching their students etiquette of behavior on stage, such as bowing; addressing problems like stage fright or nervous tics; and the use of equipment such as microphones. Some students may also be preparing for careers in the fields of opera or musical theater, where acting skills are required. Many voice instructors will spend time on acting techniques and audience communication with students in these fields of interest. Students of opera also spend a great deal of time with their voice teachers learning foreign language pronunciations.

Voice projection

Voice projection is the strength of speaking or singing whereby the voice is used loudly and clearly. It is a technique which can be employed to demand respect and attention, such as when a teacher is talking to a class, or simply to be heard clearly, as by an actor in a theatre.

Breath technique is essential for proper voice projection. Whereas in normal talking one may use air from the top of the lungs, a properly projected voice uses air flowing from the expansion of the diaphragm. In good vocal technique, well-balanced respiration is especially important to maintaining vocal projection. The external intercostal muscles are used only to enlarge the chest cavity, whilst the counterplay between the diaphragm and abdominal muscles is trained to control airflow. The goal is to isolate and relax the muscles controlling the vocal folds so that they are unimpaired by tension. Stance is also important: it is recommended to stand up straight with the feet shoulder-width apart and the upstage foot (the right foot if right-handed, etc.) slightly forward. This improves balance and breathing.

In singing, voice projection is often equated with resonance, i.e. the concentrated pressure through which one produces a focused sound. True resonance will produce the greatest amount of projection available to a voice by utilizing all the key resonators found in the vocal cavity. The size, shape, and hardness of the resonators all factor into the production of overtones and ultimately determine the projective capacities of the voice. As the sound being produced and these resonators find the same overtones, the sound will begin to "spin" as it reaches the ideal singer's formant at about 2800 Hz.

Voice type

A voice type is a particular kind of human singing voice perceived as having certain identifying qualities or characteristics. Voice classification is the process by which human voices are evaluated and thereby designated into voice types (female voices: soprano, mezzo-soprano, contralto; male voices: countertenor, tenor, baritone, bass). These qualities include but are not limited to: vocal range, vocal weight, vocal tessitura, vocal timbre, and vocal transition points such as breaks and lifts within the voice. Other considerations are physical characteristics, speech level, scientific testing, and vocal registration. Voice classification is often used within opera to associate possible roles with potential voices. It is a tool for singers, composers, venues, and listeners to categorize vocal properties. The science behind voice classification developed within European classical music and is not generally applicable to other forms of singing. There are currently several different systems in use, including the German Fach system and the choral music system, among many others. No system is universally applied or accepted. This article focuses on voice classification within classical music; for other contemporary styles of singing see: Voice classification in non-classical music.
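The idea that resonators reinforce whichever overtones land near the singer's formant can be made concrete with a small sketch: for a sung fundamental, list the harmonics that fall inside a band around 2800 Hz. The 2800 Hz center comes from the text; the bandwidth is an assumption chosen purely for illustration.

```python
# Which harmonics of a sung fundamental fall near the singer's formant?
# The 2800 Hz center is from the text; the +/-400 Hz band is an assumed
# illustrative width, not a measured formant bandwidth.

def boosted_harmonics(f0_hz: float, center: float = 2800.0, half_width: float = 400.0):
    """Return (harmonic number, frequency) pairs lying within the formant band."""
    hits = []
    n = 1
    while n * f0_hz <= center + half_width:
        f = n * f0_hz
        if abs(f - center) <= half_width:
            hits.append((n, f))
        n += 1
    return hits

# For a tenor singing A4 (440 Hz), the 6th and 7th harmonics sit in the band.
print(boosted_harmonics(440.0))
```

A lower fundamental packs harmonics more densely, so more of them fall inside the band, which is one way to picture why the formant region can be energized across a singer's whole range.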

Number of voice types

There are a plethora of different voice types used by vocal pedagogists today in a variety of voice classification systems. Most of these types, however, are sub-types that fall under seven major voice categories that are for the most part acknowledged across all of the major voice classification systems. Women are typically divided into three groups: soprano, mezzo-soprano, and contralto. Men are usually divided into four groups: countertenor, tenor, baritone, and bass. When considering the pre-pubescent male voice, an eighth term, treble, can be applied. Within each of these major categories there are several sub-categories that identify specific vocal qualities, such as coloratura facility and vocal weight, to differentiate between voices.

There have been times when voice classification systems have been used too rigidly, i.e. a house assigning a singer to a specific type and only casting him or her in roles they consider belonging to this category. Some singers such as Enrico Caruso, Rosa Ponselle, Joan Sutherland, Maria Callas, Ewa Podles, or Plácido Domingo have voices which allow them to sing roles from a wide variety of types; some singers such as Shirley Verrett or Grace Bumbry change type, and even voice part, over their careers; and some singers such as Leonie Rysanek have voices which lower with age, causing them to cycle through types over their careers. Some roles as well are hard to classify, having very unusual vocal requirements; Mozart wrote many of his roles for specific singers who often had remarkable voices, and some of Verdi's early works make extreme demands on his singers. A singer will ultimately choose a repertoire that suits their instrument.

A note on vocal range vs. tessitura: Choral singers are classified into voice parts based on range, while solo singers are classified into voice types based in part on tessitura, i.e. where the voice feels most comfortable for the majority of the time.

Female voices

The range specifications given below are based on American scientific pitch notation.

Soprano

Soprano range: The soprano is the highest female voice. The typical soprano voice lies between middle C (C4) and "high C" (C6). The low extreme for sopranos is roughly B3 or A3 (just below middle C). Most soprano roles do not extend above "high C", although there are several standard soprano roles that call for D6 or D-flat 6. At the highest extreme, some coloratura soprano roles may reach from F6 to A6 (the F to A above "high C").

Soprano tessitura: The tessitura of the soprano voice lies higher than that of all the other female voices; the coloratura soprano has the highest tessitura of all the soprano subtypes.

Soprano sub-types: As with all voice categories, sopranos are often divided into different sub-categories based on range, vocal color or timbre, the weight of the voice, and the dexterity of the voice. These sub-categories include: coloratura soprano, soubrette, lyric soprano, spinto, and dramatic soprano. (For more information on roles and singers, see the individual voice type pages.)

Intermediate voice types: Two types of soprano especially dear to the French are the Dugazon and the Falcon, which are intermediate voice types between the soprano and the mezzo-soprano: a Dugazon is a darker-colored soubrette, a Falcon a darker-colored soprano drammatico.

Mezzo-soprano

Mezzo-soprano range: The mezzo-soprano is the middle-range voice type for females and is the most common female voice. The mezzo-soprano voice lies between the soprano voice and the contralto voice, overlapping both of them. The typical range of this voice is between A3 (the A below middle C) and A5 (the A two octaves above A3). In the lower and upper extremes, some mezzo-sopranos may extend down to the G below middle C (G3) and as high as "high C" (C6).
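The scientific pitch notation used throughout these range descriptions maps directly to frequencies in twelve-tone equal temperament, with A4 = 440 Hz as the conventional reference: a note n semitones above A4 has frequency 440 · 2^(n/12). A small sketch of that conversion (the parsing of names like "C4" or "B-flat3" is a convenience written for this example):

```python
# Convert scientific pitch notation (e.g. "C4", "A5", "B-flat3") to a
# 12-tone equal-tempered frequency with A4 = 440 Hz.

NOTE_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def pitch_to_hz(name: str) -> float:
    """Return the 12-TET frequency (Hz) of a note in scientific pitch notation."""
    letter = name[0].upper()
    rest = name[1:]
    offset = 0
    if rest.startswith("-flat"):
        offset, rest = -1, rest[5:]
    elif rest.startswith("-sharp"):
        offset, rest = 1, rest[6:]
    elif rest.startswith("b"):
        offset, rest = -1, rest[1:]
    elif rest.startswith("#"):
        offset, rest = 1, rest[1:]
    octave = int(rest)
    # Signed semitone distance from A4 (A is the 9th semitone of its octave).
    semitones_from_a4 = (octave - 4) * 12 + NOTE_SEMITONES[letter] + offset - 9
    return 440.0 * 2.0 ** (semitones_from_a4 / 12)

print(round(pitch_to_hz("C4"), 2))   # middle C
print(round(pitch_to_hz("C6"), 2))   # "high C"
print(round(pitch_to_hz("A3"), 2))   # the low end of the typical mezzo range
```

This makes statements like "the typical soprano voice lies between C4 and C6" concrete: the upper note is exactly two octaves, i.e. a factor of four in frequency, above the lower one.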

Mezzo-soprano tessitura: Although this voice overlaps both the contralto and soprano voices, the tessitura of the mezzo-soprano is lower than that of the soprano and higher than that of the contralto.

Mezzo-soprano sub-types: Mezzo-sopranos are often broken down into three categories: lyric mezzo-soprano, coloratura mezzo-soprano, and dramatic mezzo-soprano.

Contralto

Contralto range: The contralto voice is the lowest female voice. The typical contralto range lies between the F below middle C (F3) and the second F above middle C (F5). In the lower and upper extremes, some contralto voices can sing from the E below middle C (E3) to the second B-flat above middle C (B-flat 5), which is only one whole step short of the "soprano C".

Contralto tessitura: The contralto voice has the lowest tessitura of the female voices. A true operatic contralto is extremely rare, so much so that roles intended for contraltos are often performed by mezzo-sopranos, as this voice type is difficult to find.

Contralto sub-types: Contraltos are often broken down into two categories: lyric contralto and dramatic contralto.

Alto

Contralto and alto are not the same term. Technically, "alto" is not a voice type but a designated vocal line in choral music based on vocal range. The range of the alto part in choral music is usually more similar to that of a mezzo-soprano than a contralto. In many compositions the alto line is split into two parts; the lower part, alto 2, is usually more suitable to a contralto voice than a mezzo-soprano voice. Female singers with very low vocal tessituras are often included among mezzo-sopranos.

Male voices

Countertenor

The term countertenor refers to the highest male voice. In current operatic practice, singers called countertenors generally sing in the falsetto register, sometimes using their modal register for the lowest notes. Except for a few very rare voices (such as the American male soprano Michael Maniaci, or singers with a disorder such as Kallmann syndrome), countertenor ranges are approximately as follows [citation needed]:

Countertenor: from about G3 to E5 or F5
Sopranist: extending the upper range to usually only C6, but some as high as E6 or F6
Haute-contre: from about D3 or E3 to about D5

Countertenor sub-types: There are several sub-types of countertenors, including the sopranist or male soprano, the haute-contre, and the modern castrato. Many countertenor singers perform roles originally written for castrati in baroque operas. It should be remembered that, until about 1830, all male voices used some falsetto-type voice production in their upper range. However, there is much evidence that, in England at least, "countertenor" also designated a very high tenor voice, the equivalent of the French haute-contre, and something similar to the "leggiero tenor" or tenor altino.

Tenor

Tenor range: The tenor is the highest male voice within the modal register. The typical tenor voice lies between the C one octave below middle C (C3) and the C one octave above middle C (C5). The low extreme for tenors is roughly B-flat 2 (the second B-flat below middle C). At the highest extreme, some tenors can sing up to the second F above middle C (F5).

Tenor tessitura: The tessitura of the tenor voice lies above the baritone voice and below the countertenor voice. The leggiero tenor has the highest tessitura of all the tenor sub-types.

Tenor sub-types: Tenors are often divided into different sub-categories based on range, vocal color or timbre, the weight of the voice, and the dexterity of the voice. These sub-categories include: leggiero tenor, lyric tenor, spinto tenor, dramatic tenor, and heldentenor.

Baritone

The baritone is the most common type of male voice.

Baritone range: The vocal range of the baritone lies between the bass and tenor ranges, overlapping both of them. The typical baritone range is from the second F below middle C (F2) to the F above middle C (F4). In the lower and upper extremes, a baritone's range can be extended at either end.

Baritone tessitura: Although this voice overlaps both the tenor and bass voices, the tessitura of the baritone is lower than that of the tenor and higher than that of the bass.

Baritone sub-types: Baritones are often divided into different sub-categories based on range, vocal color or timbre, the weight of the voice, and dexterity of the voice. These sub-categories include: lyric baritone, bel canto (coloratura) baritone, kavalierbariton, Verdi baritone, dramatic baritone, baryton-noble, and bariton/baryton-Martin.

Bass

Bass range: The bass is the lowest male voice. The typical bass range lies between the second E below middle C (E2) and the E above middle C (E4), which is exactly two octaves. In the lower and upper extremes of the bass voice, some basses can sing from the C two octaves below middle C (C2) to the G above middle C (G4).

Bass tessitura: The bass voice has the lowest tessitura of all the voices.

Bass sub-types: Basses are often divided into different sub-categories based on range, vocal color or timbre, the weight of the voice, and dexterity of the voice. These sub-categories include: basso profondo, basso cantante, dramatic bass, basso buffo / bel canto bass, and bass-baritone.

Children's voices

The voice from childhood to adulthood

The human voice is in a constant state of change and development, just as the whole body is. A human voice will alter as a person gets older, moving from immaturity to maturity, to a peak period of prime singing, and then ultimately into a declining period. The vocal range and timbre of children's voices do not have the variety that adults' voices have. Both boys and girls prior to puberty have an equivalent vocal range and timbre, because both groups have a similar laryngeal size and height and a similar vocal cord structure.

With the onset of puberty, both men's and women's voices alter as the vocal ligaments become more defined and the laryngeal cartilages harden. The laryngeal structure of both voices changes, but more so in men: the height of the male larynx becomes much longer than in women. The size and development of adult lungs also change what the voice is physically capable of doing. From the onset of puberty to approximately age 22, the human voice is in an in-between phase, not quite a child's voice nor yet an adult one. This is not to suggest that the voice stops changing at that age. Different singers will reach adult development earlier or later than others, and as stated above there are continual changes throughout adulthood as well.

Treble

The term treble can refer to either a young female or young male singer with an unchanged voice in the soprano range. Initially the term was associated with boy sopranos, but as the inclusion of girls in children's choirs became acceptable in the twentieth century, the term has expanded to refer to all pre-pubescent voices. Grouping children's voices into one category is also practical, as both boys and girls share a similar range and timbre.

Treble range: Most trebles have an approximate range from the A below middle C (A3) to the F one and a half octaves above middle C (F5). Some trebles, however, can extend their voices higher in the modal register to "high C" (C6). This ability may be comparatively rare, but the Anglican church repertory, which many trained trebles sing, frequently demands G5 and even A5. Many trebles are also able to reach higher notes by use of the whistle register, but this practice is rarely called for in performance.

Classifying singers

Voice classification is important for vocal pedagogists and singers as a guiding tool for the development of the voice. In general, vocal pedagogists consider four main qualities of a human voice when attempting to classify it: vocal range, tessitura, timbre, and vocal transition points. Teachers may also consider physical characteristics, speech level, scientific testing, and other factors.

Misclassification can damage the vocal cords, shorten a singing career, and lead to the loss of both vocal beauty and free vocal production. Singing outside the natural vocal range imposes a serious strain upon the voice, and clinical evidence indicates that singing at a pitch level that is either too high or too low creates vocal pathology; however, the possibility of damage seems to be much more prevalent in too high a classification. A number of medical authorities have indicated that singing at too high a pitch level may contribute to certain vocal disorders, and medical evidence indicates that it may lead to the development of vocal nodules. Increasing tension on the vocal cords is one of the means of raising pitch, and singing above an individual's best tessitura keeps the vocal cords under a great deal of unnecessary tension for long periods of time, so the possibility of vocal abuse is greatly increased. Singing at too low a pitch level is not as likely to be damaging unless a singer tries to force the voice down.

Some of these dangers are not immediate ones. The human voice is quite resilient, especially in early adulthood, and the damage may not make its appearance for months or even years. Unfortunately, this lack of apparent immediate harm can cause singers to develop bad habits that will over time cause irreparable damage to the voice. Noted vocal pedagogist Margaret Greene says, "The need for choosing the correct natural range of the voice is of great importance in singing since the outer ends of the singing range need very careful production and should not be overworked, even in trained voices." Singing at either extreme of the range may be damaging, but for most singers there are fewer dangers in singing too low than in singing too high.

Dangers of quick identification

Many vocal pedagogists warn of the dangers of quick identification. Premature concern with classification can result in misclassification, with all its attendant dangers. William Vennard says: "I never feel any urgency about classifying a beginning student. So many premature diagnoses have been proved wrong, and it can be harmful to the student and embarrassing to the teacher to keep striving for an ill-chosen goal. It is best to begin in the middle part of the voice and work upward and downward until the voice classifies itself."

Most vocal pedagogists believe it is essential to establish good vocal habits within a limited and comfortable range before attempting to classify the voice. When techniques of posture, breathing, phonation, resonation, and articulation have become established in this comfortable area, the true quality of the voice will emerge and the upper and lower limits of the range can be explored safely. Only then can a tentative classification be arrived at, and it may be adjusted as the voice continues to develop. Many vocal pedagogists suggest that teachers begin by assuming a voice is of medium classification until it proves otherwise. The reason for this is that the majority of individuals possess medium voices, and this approach is therefore less likely to misclassify or damage the voice.

Choral music classification

Unlike other classification systems, choral music divides voices solely on the basis of vocal range. Choral music most commonly divides vocal parts into high and low voices within each sex (SATB). Unfortunately, the typical choral situation affords many opportunities for misclassification to occur. Since most people have medium voices, they must be assigned to a part that is either too high or too low for them; the mezzo-soprano must sing soprano or alto, and the baritone must sing tenor or bass. Either option can present problems for the singer.

Speech synthesis: synthesizer technologies

There are two main technologies used for generating synthetic speech waveforms: concatenative synthesis and formant synthesis. Concatenative synthesis is based on the concatenation (or stringing together) of segments of recorded speech. Generally, concatenative synthesis gives the most natural-sounding synthesized speech. However, natural variation in speech and automated techniques for segmenting the waveforms sometimes result in audible glitches in the output, detracting from the naturalness.

Use of the Web by People with Disabilities

Abstract

This document provides an introduction to use of the Web by people with disabilities. It illustrates some of their requirements when using Web sites and Web-based applications, and provides supporting information for the guidelines and technical work of the World Wide Web Consortium's (W3C) Web Accessibility Initiative (WAI).

Table of Contents

1. Introduction
2. Scenarios of People with Disabilities Using the Web
3. Different Disabilities That Can Affect Web Accessibility
4. Assistive Technologies and Adaptive Strategies
5. Further Reading
6. Scenario References
7. General References
8. Acknowledgements

1. Introduction

The Web Accessibility Initiative (WAI) develops guidelines for the accessibility of Web sites, browsers, and authoring tools, in order to make it easier for people with disabilities to use the Web. Many of the accessibility solutions described in WAI materials also benefit Web users who do not have disabilities. Given the Web's increasingly important role in society, access to the Web is vital for people with disabilities.

This document provides a general introduction to how people with different kinds of disabilities use the Web. It is not a comprehensive or in-depth discussion of disabilities, nor of the assistive technologies used by people with disabilities. It provides background to help understand how people with disabilities benefit from the provisions described in the Web Content Accessibility Guidelines 1.0, the Authoring Tool Accessibility Guidelines 1.0, and the User Agent Accessibility Guidelines 1.0. Specifically, this document describes:
- scenarios of people with disabilities using accessibility features of Web sites and Web-based applications;

- general requirements for Web access by people with physical, visual, hearing, speech, and cognitive or neurological disabilities;
- some types of assistive technologies and adaptive strategies used by some people with disabilities when accessing the Web.

This document contains many internal hypertext links between the sections on scenarios, disability requirements, assistive technologies, and scenario references. The scenario references and general references sections also include links to external documents. Disability terminology varies from one country to another, as do educational and employment opportunities; in some cases, browsers, media players, or assistive technologies with specific features supporting accessibility may not yet be available in an individual's primary language.

2. Scenarios of People with Disabilities Using the Web

The following scenarios show people with different kinds of disabilities using assistive technologies and adaptive strategies to access the Web. In some cases the scenarios show how the Web can make some tasks easier for people with disabilities. Please note that the scenarios do not represent actual individuals, but rather individuals engaging in activities that are possible using today's Web technologies and assistive technologies. The reader should not assume that everyone with a similar disability to those portrayed will use the same assistive technologies or have the same level of expertise in using those technologies.

Each scenario contains links to additional information on the specific disability or disabilities, described in more detail in Section 3; to the assistive technology or adaptive strategy described in Section 4; and to detailed curriculum examples or guideline checkpoints in the Scenario References in Section 6.

Following is a list of scenarios and accessibility solutions:
- online shopper with color blindness (user control of style sheets)
- reporter with repetitive stress injury (keyboard equivalents for mouse-driven commands, access keys)
- online student who is deaf (captioned audio portions of multimedia files)
- accountant with blindness (appropriate markup of tables, alternative text, expansion of abbreviations and acronyms, labelled frames, synchronization of visual, speech, and braille display)
- classroom student with dyslexia (use of supplemental graphics, freezing animated graphics, multiple search options)
- retiree with aging-related conditions, managing personal finances (magnification, stopping scrolling text, avoiding pop-up windows)
- supermarket assistant with cognitive disability (clear and simple language, consistent design, consistent navigation options, multiple search options)
- teenager with deaf-blindness, seeking entertainment (user control of style sheets, accessible multimedia, device-independent access, appropriate table markup)

Online shopper with color blindness

Mr. Lee wants to buy some new clothes, appliances, and music. As he frequently does, he is spending an evening shopping online. He has one of the most common visual disabilities for men: color blindness, which in his case means an inability to distinguish between green and red.

When he first started using the Web, it seemed to him that the text and images on a lot of sites used poor color contrast. He had difficulty reading the text on many Web sites, since to him the colors appeared to be similar shades of brown. He realized that many sites were using colors that were indistinguishable to him because of his red/green color blindness. In some cases the site instructions explained that discounted prices were indicated by red text, but all of the text looked brown to him. In other cases, the required fields on forms were indicated by red text, but again he could not tell which fields had red text.

Mr. Lee discovered that on most newer sites the colors were controlled by style sheets, and that he could turn these style sheets off with his browser or override them with his own style sheets. On sites that did not use style sheets, however, he couldn't override the colors. After additional experimentation, Mr. Lee found that he preferred sites that used sufficient color contrast and redundant information for color. The sites did this by including the names of the colors of clothing as well as showing a sample of the color, so that he did not have to guess at which items were discounted, and by placing an asterisk (*) in front of the required fields in addition to indicating them by color. Eventually Mr. Lee bookmarked a series of online shopping sites where he could get reliable information on product colors.

Reporter with repetitive stress injury

Mr. Jones is a reporter who must submit his articles in HTML for publishing in an online journal. Over his twenty-year career, he has developed repetitive stress injury (RSI) in his hands and arms, and it has become painful for him to type. He uses a combination of speech recognition and an alternative keyboard to prepare his articles, but he doesn't use a mouse. It took him several months to become sufficiently accustomed to using speech recognition to be comfortable working for many hours at a time.

He has not been able to use the same Web authoring software as his colleagues, because the application that his office chose as a standard is missing many of the keyboard equivalents that he needs in place of mouse-driven commands. To activate commands that do not have keyboard equivalents, he would have to use a mouse, and this would re-damage his hands. He researched some of the newer versions of authoring tools and selected one with full keyboard support. Within a month, he discovered that several of his colleagues had switched to the new product as well, after they found that the full keyboard support was easier on their own hands.

When browsing other Web sites to research some of his articles, Mr. Jones likes the access key feature that is implemented on some Web pages. It enables him to shortcut a long list of links that he would ordinarily have to tab through by voice, and instead go straight to the link he wants. There are some things he has not worked out yet, such as a sound card conflict that arises whenever he tries to use speech recognition on Web sites that have streaming audio.

Online student who is deaf

Ms. Martinez is taking several distance learning courses in physics. She is deaf. She had little trouble with the curriculum until the university upgraded its online courseware to a multimedia approach, using an extensive collection of audio lectures. For classroom-based lectures the university provided interpreters; for Web-based instruction, however, they initially did not realize that accessibility was an issue, and then said they had no idea how to provide the material in accessible format. She was able to point out that the university was clearly covered by a policy requiring accessibility of online instructional material, and then to point to the Web Content Accessibility Guidelines 1.0 as a resource providing guidance on how to make Web sites, including those with multimedia, accessible.

The university had the lectures transcribed and made this information available through its Web site along with audio versions of the lectures. For an introductory multimedia piece, the university used a SMIL-based multimedia format enabling synchronized captioning of audio and description of video. The school's information managers quickly found that it was much easier to comprehensively index the audio resources on the accessible area of the Web site once these resources were captioned with text.

The professor for the course also set up a chat area on the Web site where students could exchange ideas about their coursework. Although she was the only deaf student in the class and only one other student knew any sign language, she quickly found that the Web-based chat format, combined with the opportunity to provide Web-based text comments on classmates' work, ensured that she could keep up with class progress.

Accountant with blindness

Ms. Laitinen is an accountant at an insurance company that uses Web-based formats over a corporate intranet. She is blind. She uses a screen reader to interpret what is displayed on the screen and generate a combination of speech output and refreshable braille output. She uses the speech output, combined with tabbing through the navigation links on a page, for rapid scanning of a document, and has become accustomed to listening to speech output at a speed that her co-workers cannot understand at all. She uses refreshable braille output to check the exact wording of text, since braille enables her to read the language on a page more precisely.

Much of the information in the Web documents used at her company is in tables, which can sometimes be difficult for non-visual users to read. Since the tables in this company's documents are marked up clearly with column and row headers which her screen reader can access, she easily orients herself to the information in the tables. Her screen reader also reads her the alternative text for any images on the site. Since the insurance codes she must frequently reference include a number of abbreviations and acronyms, she finds that the expansion of abbreviations and acronyms the first time they appear on a page allows her to better catch the meaning of the short versions of these terms.

As one of the more senior members of the accounting staff, Ms. Laitinen must frequently help newer employees with their questions. She has recently upgraded to a browser that allows better synchronization of the screen display with audio and braille rendering of that information. This enables her to better help her colleagues, since the screen shows her colleagues the same part of the document that she is reading with speech or braille output.

Classroom student with dyslexia

Ms. Olsen attends middle school, and particularly likes her literature class. She has attention deficit disorder with dyslexia, and the combination leads to substantial difficulty reading. Her school recently started to use more online curricula to supplement class textbooks, and with recent accommodations to the curriculum she has become enthusiastic about this class.

Her class's recent area of focus is Hans Christian Andersen's writings, and she has to do some research about the author. She was initially worried about the reading load, since she reads slowly. But recently she tried text-to-speech software, and found that she was able to read along visually with the text much more easily when she could hear certain sections of it read to her with the speech synthesis, instead of struggling over every word. Some of the pages have a lot of graphics, and those help her focus in quickly on the sections she wants to read. Where the graphics are animated, however, it is very hard for her to focus, and so it helps to be able to freeze the animated graphics.

One of the most important things for her has been the level of accessibility of the Web-based online library catalogues and the general search functions on the Web. Sometimes the search options are confusing for her, but her teacher has taught her a number of different search strategies, and she finds that some sites provide options for a variety of searching strategies, so that she can more easily select searching options that work well for her. In some cases, she finds that some sites are much easier for her to use than others.

Retiree with several aging-related conditions, managing personal finances

Mr. Yunus uses the Web to manage some of his household services and finances. He has some central-field vision loss, hand tremor, and a little short-term memory loss. He uses a screen magnifier to help with his vision and his hand tremor. When the icons and links on Web pages are bigger, it's easier for him to select them, and so he finds it easier to use pages with style sheets.

When he first started using some of the financial pages, he found the scrolling stock tickers distracting, and they moved too fast for him to read. Sometimes the pages would update before he had finished reading them. He also tended to "get stuck" on some pages, finding that he could not back up on some sites where new browser windows would pop open without notifying him. Therefore he tends to use Web sites that do not have a lot of movement in the text, that do not auto-refresh, and that avoid pop-up windows. Mr. Yunus has gradually found some sites that work well for him, and has developed a customized profile at some banking, grocery, and clothing sites.

Supermarket assistant with cognitive disability

Mr. Sands has put groceries in bags for customers for the past year at a supermarket. He has Down syndrome, and has difficulty with abstract concepts, reading, and doing mathematical calculations. He usually buys his own groceries at this supermarket, but he has difficulty re-learning where his favorite products are each time the supermarket changes the layout of its products, he finds it difficult to keep track of how much he is spending, and he sometimes finds that there are so many product choices that he becomes confused.

Recently, he visited an online grocery service from his computer at home. He explored the site the first few times with a friend. His friend showed him different search options that were available on the site, which were helpful in navigating around the site. He can search by brand name or by pictures, but he mostly uses the option that lets him select from a list of products that he has ordered in the past, making it easier for him to find items and to recognize his favorite brands. He found that he could use the Web site without much difficulty, since it had a lot of pictures. Once he decides what he wants to buy, he selects the item and puts it into his virtual shopping basket. The Web site gives him an updated total each time he adds an item, helping him make sure that he does not overspend his budget.

The marketing department of the online grocery wanted their Web site to have a high degree of usability in order to be competitive with other online stores. They used consistent design and consistent navigation options so that their customers could learn and remember their way around the Web site. They also used the clearest and simplest language appropriate for the site's content so that their customers could quickly understand the material. While these features made the site more usable for all of the online grocery's customers, they also made it possible for Mr. Sands to use the site. Mr. Sands now shops on the online grocery site a few times a month, and just buys a few fresh items each day at the supermarket where he works.

Teenager with deaf-blindness, seeking entertainment

Ms. Kaseem uses the Web to find new restaurants to go to with friends and classmates. She has low vision and is deaf. She uses a screen magnifier to enlarge the text on Web sites to a font size that she can read. When screen magnification is not sufficient, she also uses a screen reader to drive a refreshable braille display, which she reads slowly. She uses a personal style sheet with her browser, which makes all Web pages display according to her preferences. Her preferences include having background patterns turned off so that there is enough contrast for her when she uses screen magnification.

Ms. Kaseem browses local Web sites for new and different restaurants. This is especially helpful when she reads online sample menus of appealing restaurants. She also checks the public transportation sites to find local train or bus stops near the restaurants, and the schedules. The Web site for the local train schedule is easy to use because the frames on that Web site have meaningful titles. The Web site for the bus schedule, however, has frames without meaningful titles, and tables without clear column or row headers, so she often gets lost on that site when trying to find the information she needs.

A multimedia virtual tour of local entertainment options was recently added to the Web site of the city in which Ms. Kaseem lives. The tour is captioned and described, including text subtitles for the audio and descriptions of the video, which allows her to access it using a combination of screen magnification and braille. The interface used for the virtual tour is accessible no matter what kind of assistive technology she is using: screen magnification, her screen reader with refreshable braille, or her portable braille device. Ms. Kaseem forwards the Web site address to friends and asks if they are interested in going with her to some of the restaurants featured on the tour.

The train schedules are laid out as long tables with clear row and column headers that she uses to orient herself even when she has magnified the screen display. Occasionally she also uses her portable braille device, with an infrared connection, to get additional information and directions at a publicly-available information kiosk in a shopping mall downtown, and a few times she has downloaded sample menus into her braille device so that she has them in an accessible format once she is in the restaurant.

3. Different Disabilities That Can Affect Web Accessibility

This section describes general kinds of disabilities that can affect access to the Web. The term "disability" is used very generally in this document, except where otherwise noted. Some people with conditions described below would not consider themselves to have disabilities. They may, however, have limitations of sensory, physical or cognitive functioning which can affect access to the Web. These may include injury-related and aging-related conditions, and can be temporary or chronic. Abilities can vary from person to person, and over time, for different people with the same type of disability. People can have combinations of different disabilities, and combinations of varying levels of severity.

There are as yet no universally accepted categorizations of disability. Commonly used disability terminology varies from country to country and between different disability communities in the same country. There is a trend in many disability communities to use functional terminology instead of medical classifications. This document does not attempt to comprehensively address issues of terminology.

The number and severity of limitations tend to increase as people age, and may include changes in vision, hearing, memory, or motor function. Aging-related conditions can be accommodated on the Web by the same accessibility solutions used to accommodate people with disabilities.

Sometimes different disabilities require similar accommodations. For example, someone who is blind and someone who cannot use his or her hands both require full keyboard equivalents for mouse commands in browsers and authoring tools, since they both have difficulty using a mouse but can use assistive technologies to activate commands supported by a standard keyboard interface.

Many accessibility solutions described in this document contribute to "universal design" (also called "design for all") by benefiting non-disabled users as well as people with disabilities. For instance, support for speech output not only benefits blind users, but also Web users whose eyes are busy with other tasks, while captions for audio not only benefit deaf users, but also increase the efficiency of indexing and searching for audio content on Web sites.

Each description of a general type of disability includes several brief examples of the kinds of barriers someone with that disability might encounter on the Web. These lists of barriers are illustrative and not intended to be comprehensive. The barrier examples listed here are representative of accessibility issues that are relatively easy to address with existing accessibility solutions.

Following is a list of some disabilities and their relation to accessibility issues on the Web.

- visual disabilities
  - blindness
  - low vision
  - color blindness
- hearing impairments
  - deafness
  - hard of hearing
- physical disabilities
  - motor disabilities
- speech disabilities
- cognitive and neurological disabilities
  - dyslexia and dyscalculia
  - attention deficit disorder
  - intellectual disabilities
  - memory impairments
  - mental health disabilities
  - seizure disorders
- multiple disabilities
- aging-related conditions

Visual disabilities

Blindness (scenario: "accountant")

Blindness involves a substantial, uncorrectable loss of vision in both eyes. To access the Web, many individuals who are blind rely on screen readers: software that reads text on the screen (monitor) and outputs this information to a speech synthesizer and/or refreshable braille display. Some people who are blind use text-based browsers such as Lynx, or voice browsers, instead of a graphical user interface browser plus screen reader. They may use rapid navigation strategies such as tabbing through the headings or links on Web pages rather than reading every word on the page in sequence.

Examples of barriers that people with blindness may encounter on the Web can include:
- images that do not have alternative text
- complex images (e.g., graphs or charts) that are not adequately described
- video that is not described in text or audio
- tables that do not make sense when read serially (in a cell-by-cell or "linearized" mode)
- frames that do not have "NOFRAME" alternatives, or that do not have meaningful names
- forms that cannot be tabbed through in a logical sequence or that are poorly labelled
- browsers and authoring tools that lack keyboard support for all commands
- browsers and authoring tools that do not use standard application programmer interfaces for the operating system they are based in
- non-standard document formats that may be difficult for their screen reader to interpret
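The "linearized" table barrier mentioned above can be illustrated with a short sketch. The table data below is invented, and the header-pairing strategy is a simplified stand-in for what screen readers can do when tables carry proper row and column header markup; it is not any specific screen reader's algorithm.

```python
def linearize(rows):
    """Read every cell serially, left to right, top to bottom,
    as a screen reader must when a table has no header markup."""
    return [cell for row in rows for cell in row]

def announce_with_headers(rows):
    """Pair each data cell with its row and column header,
    assuming the first row holds column headers and the first
    column holds row headers."""
    col_headers = rows[0][1:]
    announcements = []
    for row in rows[1:]:
        row_header = row[0]
        for col_header, cell in zip(col_headers, row[1:]):
            announcements.append(f"{row_header}, {col_header}: {cell}")
    return announcements

table = [
    ["Policy", "Premium", "Deductible"],
    ["Auto",   "$90",     "$500"],
    ["Home",   "$120",    "$1000"],
]

print(linearize(table))             # all nine cells in reading order
print(announce_with_headers(table)) # first entry: "Auto, Premium: $90"
```

Read serially, "$500" is just a number floating between other cells; paired with its headers it becomes "Auto, Deductible: $500", which is why clear header markup matters for non-visual users.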

Low vision (scenarios: "teenager" and "retiree")

There are many types of low vision (also known as "partially sighted" in parts of Europe), for instance poor acuity (vision that is not sharp), tunnel vision (seeing only the middle of the visual field), central field loss (seeing only the edges of the visual field), and clouded vision.

To use the Web, some people with low vision use extra-large monitors, and increase the size of system fonts and images. Others use screen magnifiers or screen enhancement software. Some individuals use specific combinations of text and background colors, such as a 24-point bright yellow font on a black background, or choose certain typefaces that are especially legible for their particular vision requirements.

Barriers that people with low vision may encounter on the Web can include:

- Web pages with absolute font sizes that do not change (enlarge or reduce) easily
- Web pages that, because of inconsistent layout, are difficult to navigate when enlarged, due to loss of surrounding context
- Web pages, or images on Web pages, that have poor contrast, and whose contrast cannot be easily changed through user override of author style sheets
- text presented as images, which prevents wrapping to the next line when enlarged
- also many of the barriers listed for blindness, above, depending on the type and extent of visual limitation

Color blindness (scenario: "shopper")

Color blindness is a lack of sensitivity to certain colors. Common forms of color blindness include difficulty distinguishing between red and green, or between yellow and blue. Sometimes color blindness results in the inability to perceive any color.

To use the Web, some people with color blindness use their own style sheets to override the font and background color choices of the author.

Barriers that people with color blindness may encounter on the Web can include:

- color that is used as a unique marker to emphasize text on a Web site
- text that inadequately contrasts with background color or patterns
- browsers that do not support user override of authors' style sheets

Hearing impairments

Deafness (scenario: "online student")

Deafness involves a substantial uncorrectable impairment of hearing in both ears. Some deaf individuals' first language is a sign language, and they may or may not read a written language fluently, or speak clearly.

To use the Web, many people who are deaf rely on captions for audio content. They may need to turn on the captions on an audio file as they browse a page, concentrate harder to read what is on a page, or rely on supplemental images to highlight context.
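Several of the barriers above involve inadequate contrast between text and background. Contrast can be measured: the sketch below uses the relative-luminance and contrast-ratio formulas later standardized in WCAG 2 (the formulas and the 4.5:1 body-text threshold come from that guideline, not from this document).

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color per the WCAG 2 definition."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio between two colors; ranges from 1:1 to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background gives the maximum possible contrast.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
# Light gray on white falls far short of the common 4.5:1 threshold.
print(round(contrast_ratio((200, 200, 200), (255, 255, 255)), 1))
```

A user style sheet that overrides author colors, as described above, is in effect a way of forcing this ratio higher regardless of the author's choices.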

Barriers that people who are deaf may encounter on the Web can include:

- lack of captions or transcripts of audio on the Web, including webcasts
- lack of content-related images in pages full of text, which can slow comprehension for people whose first language may be a sign language instead of a written/spoken language
- lack of clear and simple language
- requirements for voice input on Web sites

Hard of hearing

A person with a mild to moderate hearing impairment may be considered hard of hearing. To use the Web, people who are hard of hearing may rely on captions for audio content and/or amplification of audio. They may need to toggle the captions on an audio file on or off, or adjust the volume of an audio file.

Barriers encountered on the Web can include:

- lack of captions or transcripts for audio on the Web, including webcasts

Physical disabilities

Motor disabilities (scenario: "reporter")

Motor disabilities can include weakness, limitations of muscular control (such as involuntary movements, lack of coordination, or paralysis), limitations of sensation, joint problems, or missing limbs. Some physical disabilities can include pain that impedes movement. These conditions can affect the hands and arms as well as other parts of the body, and may include injury-related and aging-related conditions.

To use the Web, people with motor disabilities affecting the hands or arms may use a specialized mouse; a keyboard with a layout of keys that matches their range of hand motion; a pointing device such as a head-mouse, head-pointer, or mouth-stick; voice-recognition software; an eye-gaze system; or other assistive technologies to access and interact with the information on Web sites. They may activate commands by typing single keystrokes in sequence with a head pointer rather than typing simultaneous keystrokes ("chording") to activate commands. They may need more time when filling out interactive forms on Web sites if they have to concentrate or maneuver carefully to select each keystroke.

Barriers that people with motor disabilities affecting the hands or arms may encounter include:

- time-limited response options on Web pages
- browsers and authoring tools that do not support keyboard alternatives for mouse commands
- forms that cannot be tabbed through in a logical order

Speech disabilities

Speech disabilities can include difficulty producing speech that is recognizable by some voice recognition software, either in terms of loudness or clarity. To use parts of the Web that rely on voice recognition, someone with a speech disability needs to be able to use an alternate input mode, such as text entered via a keyboard.

Barriers that people with speech disabilities encounter on the Web can include:

- Web sites that require voice-based interaction and have no alternative input mode

Cognitive and neurological disabilities

Visual and Auditory Perception (scenario: "classroom student")

Individuals with visual and auditory perceptual disabilities, including dyslexia (sometimes called "learning disabilities" in Australia, Canada, the U.S., and some other countries) and dyscalculia, may have difficulty processing language or numbers. They may have difficulty processing spoken language when heard ("auditory perceptual disabilities"). They may also have difficulty with spatial orientation.

To use the Web, people with visual and auditory perceptual disabilities may rely on getting information through several modalities at the same time. For instance, someone who has difficulty reading may use a screen reader plus synthesized speech to facilitate comprehension, while someone with an auditory processing disability may use captions to help understand an audio track.

Barriers that people with visual and auditory perceptual disabilities may encounter on the Web can include:

- lack of alternative modalities for information on Web sites, for instance lack of alternative text that can be converted to audio to supplement visuals, or the lack of captions for audio

Attention deficit disorder (scenario: "classroom student")

Individuals with attention deficit disorder may have difficulty focusing on information. To use the Web, an individual with an attention deficit disorder may need to turn off animations on a site in order to be able to focus on the site's content.

Barriers that people with attention deficit disorder may encounter on the Web can include:

- distracting visual or audio elements that cannot easily be turned off
- lack of clear and consistent organization of Web sites

Intellectual disabilities (scenario: "supermarket assistant")

Individuals with impairments of intelligence (sometimes called "learning disabilities" in Europe, or "developmental disabilities" or previously "mental retardation" in the United States) may learn more slowly, or have difficulty understanding complex concepts. Down Syndrome is one among many different causes of intellectual disabilities.

To use the Web, people with intellectual disabilities may take more time on a Web site, may rely more on graphics to enhance understanding of a site, and may benefit from the level of language on a site not being unnecessarily complex for the site's intended purpose.

Barriers can include:

- use of unnecessarily complex language on Web sites
- lack of graphics on Web sites
- lack of clear or consistent organization of Web sites

Memory impairments (scenario: "retiree")

Individuals with memory impairments may have problems with short-term memory, missing long-term memory, or may have some loss of ability to recall language. To use the Web, people with memory impairments may rely on a consistent navigational structure throughout the site.

Barriers can include:

- lack of clear or consistent organization of Web sites

Mental health disabilities

Individuals with mental health disabilities may have difficulty focusing on information on a Web site, or difficulty with blurred vision or hand tremors due to side effects from medications. To use the Web, people with mental health disabilities may need to turn off distracting visual or audio elements, or to use screen magnifiers.

Barriers can include:

- distracting visual or audio elements that cannot easily be turned off
- Web pages with absolute font sizes that do not enlarge easily

Seizure disorders

Some individuals with seizure disorders, including people with some types of epilepsy (including photo-sensitive epilepsy), have seizures triggered by visual flickering or audio signals at a certain frequency. To use the Web, people with seizure disorders may need to turn off animations, blinking text, or certain frequencies of audio. Avoidance of these visual or audio frequencies in Web sites helps prevent triggering of seizures.

Barriers can include:

- use of visual or audio frequencies that can trigger seizures

Multiple Disabilities (scenario: "teenager")

Combinations of disabilities may reduce a user's flexibility in using accessibility information. For instance, someone who is deaf and has low vision might benefit from the captions on audio files, but only if the captions can be enlarged and the color contrast adjusted. Someone who cannot move his or her hands, and also cannot see the screen well, might use a combination of speech input and speech output, or configure the operating system so that multiple-keystroke commands can be entered with a sequence of single keystrokes. Someone who is both deaf and blind needs access to a text transcript of the description of the audio and video, which they could access on a refreshable braille display.

Barriers can include any of the issues already mentioned above.

Aging-Related Conditions (scenario: "retiree")

Changes in people's functional ability due to aging can include changes in abilities, or a combination of abilities, including vision, hearing, dexterity, and memory. Any one of these limitations can affect an individual's ability to access Web content. Together, these changes can become more complex to accommodate. For example, someone with low vision may need screen magnification; however, when using screen magnification the user loses surrounding contextual information, which adds to the difficulty which a user with short-term memory loss might experience on a Web site.

Assistive Technologies and Adaptive Strategies

Assistive technologies are products used by people with disabilities to help accomplish tasks that they cannot accomplish otherwise or could not do easily otherwise. When used with computers, assistive technologies are sometimes referred to as adaptive software or hardware. Some assistive technologies are used together with graphical desktop browsers, text browsers, voice browsers, multimedia players, or plug-ins. Adaptive strategies are techniques that people with disabilities use to assist in using computers or other devices; some accessibility solutions are built into the operating system, for instance the ability to change the system font size.

For example, someone who cannot see a Web page may tab through the links on a page as one strategy for helping skim the content, and might therefore need to rely on precise indicators of location and navigation options in a document. Similarly, someone who is blind can benefit from hearing an audio description of a Web-based video, and someone who is deaf can benefit from seeing the captions accompanying audio.

Following is a list of the assistive technologies and adaptive strategies described below. This is by no means a comprehensive list of all such technologies or strategies, but rather explanations of examples highlighted in the scenarios above.

- alternative keyboards or switches
- braille and refreshable braille
- scanning software
- screen magnifiers
- screen readers
- speech recognition
- speech synthesis
- tabbing through structural elements
- text browsers
- visual notification
- voice browsers

Alternative keyboards or switches (scenario: "reporter")

Alternate keyboards or switches are hardware or software devices used by people with physical disabilities that provide an alternate way of creating keystrokes that appear to come from the standard keyboard. Examples include keyboards with extra-small or extra-large key spacing, keyguards that only allow pressing one key at a time, on-screen keyboards, eyegaze keyboards, and sip-and-puff switches. Web-based applications that can be operated entirely from the keyboard, with no mouse required, support a wide range of alternative modes of input.

Braille and refreshable braille (scenarios: "accountant" and "teenager")

Braille is a system using six to eight raised dots in various patterns to represent letters and numbers that can be read by the fingertips. Braille systems vary greatly around the world. Some "grades" of braille include additional codes beyond standard alpha-numeric characters to represent common letter groupings (e.g. "th", "ble" in Grade II American English braille) in order to make braille more compact. An 8-dot version of braille has been developed to allow all ASCII characters to be represented. Refreshable or dynamic braille involves the use of a mechanical display where dots (pins) can be raised and lowered dynamically to allow any braille characters to be displayed. Refreshable braille displays can be incorporated into portable braille devices with the capabilities of small computers, which can also be used as interfaces to devices such as information kiosks.

Scanning software

Scanning software is adaptive software used by individuals with some physical or cognitive disabilities that highlights or announces selection choices (e.g. menu items, links, phrases) one at a time. A user selects a desired item by hitting a switch when the desired item is highlighted or announced.

Screen magnifiers (scenarios: "teenager" and "retiree")

Screen magnification is software used primarily by individuals with low vision that magnifies a portion of the screen for easier viewing. At the same time that screen magnifiers make presentations larger, they also reduce the area of the document that may be viewed, removing surrounding context. Some screen magnifiers offer two views of the screen: one magnified and one default size for navigation.
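The single-switch scanning method described above, cycling through choices until the user presses a switch, can be sketched as a simple loop. This is a simulation only: the choice names, the callback interface, and the "switch" itself are illustrative, not from any real scanning product.

```python
import itertools

def scan_select(choices, switch_pressed, announce=print):
    """Cycle through choices one at a time, announcing each; return
    the choice that is current when the single switch is pressed."""
    for choice in itertools.cycle(choices):
        announce(f"highlighting: {choice}")
        if switch_pressed(choice):
            return choice

# Simulated switch: the user "presses" when "Links" is highlighted.
picked = scan_select(["Home", "Links", "Search"],
                     switch_pressed=lambda c: c == "Links")
print(picked)  # Links
```

The design point is that the only input the user ever produces is a single binary signal; all of the selection richness comes from the timing of that signal against the scanning cycle.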

Screen readers (scenarios: "accountant" and "teenager")

Screen readers are software used by individuals who are blind or who have dyslexia that interprets what is displayed on a screen and directs it either to speech synthesis for audio output, or to refreshable braille for tactile output. Older screen readers make use of the rendered version of a document, so that document order or structure may be lost (e.g. when tables are used for layout) and their output may be confusing. Some screen readers use the document tree (i.e. the parsed document code) as their input.

Speech recognition

Speech (or voice) recognition is used by people with some physical disabilities or temporary injuries to hands and forearms as an input method in some voice browsers. Applications that have full keyboard support can be used with speech recognition.

Speech synthesis (speech output) (scenario: "accountant")

Speech synthesis or speech output can be generated by screen readers or voice browsers, and involves production of digitized speech from text. People who are used to using speech output sometimes listen to it at very rapid speeds.

Tabbing through structural elements (scenario: "accountant")

Some accessibility solutions are adaptive strategies rather than specific assistive technologies such as software or hardware. For instance, one strategy for rapidly scanning through links, headers, list items, or other structural items on a Web page is to use the tab key to go through the items in sequence. People who are using screen readers, whether because they are blind or dyslexic, may tab through items on a page, as may people who cannot use a mouse, as well as people using voice recognition.

Text browsers

Text browsers such as Lynx are an alternative to graphical user interface browsers. They can be used with screen readers for people who are blind. They are also used by many people who have low bandwidth connections and do not want to wait for images to download.

Visual notification

Visual notification is an alternative feature of some operating systems that allows deaf or hard of hearing users to receive a visual alert of a warning or error message that might otherwise be issued by sound.
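Tabbing through structural elements, as described above, only works if the page actually exposes structure. The sketch below, again using Python's standard html.parser, extracts the sequence of headings and links that a user skimming a page would land on; the class name and sample markup are illustrative.

```python
from html.parser import HTMLParser

class SkimOutline(HTMLParser):
    """Collects heading and link text: the items a screen-reader user
    might tab or jump through instead of reading every word."""
    def __init__(self):
        super().__init__()
        self.items = []
        self._context = None  # tag we are currently inside, if skimmable

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3", "a"):
            self._context = tag

    def handle_endtag(self, tag):
        if tag == self._context:
            self._context = None

    def handle_data(self, data):
        if self._context and data.strip():
            self.items.append((self._context, data.strip()))

outline = SkimOutline()
outline.feed("<h1>Menu</h1><p>Long text...</p><a href='/soups'>Soups</a>")
print(outline.items)  # [('h1', 'Menu'), ('a', 'Soups')]
```

A page whose "headings" are just bold text, or whose links all read "click here", yields an outline that is empty or meaningless, which is exactly the barrier such users hit.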
