Presupposition

In the branch of linguistics known as pragmatics, a presupposition (or ps) is an implicit assumption about the world, or a background belief relating to an utterance, whose truth is taken for granted in discourse. Examples of presuppositions include:

Do you want to do it again?
Presupposition: that you have done it already, at least once.

Jane no longer writes fiction.
Presupposition: that Jane once wrote fiction.

A presupposition must be mutually known or assumed by the speaker and addressee for the utterance to be considered appropriate in context. It will generally remain a necessary assumption whether the utterance is placed in the form of an assertion, denial, or question, and can be associated with a specific lexical item or grammatical feature (presupposition trigger) in the utterance.

Crucially, negation of an expression does not change its presuppositions: I want to do it again and I don't want to do it again both presuppose that the subject has done it already one or more times; My wife is pregnant and My wife is not pregnant both presuppose that the subject has a wife. In this respect, presupposition is distinguished from entailment and implicature. For example, The president was assassinated entails that The president is dead, but the negated sentence The president was not assassinated carries no such entailment.

If the presuppositions of a sentence are not consistent with the actual state of affairs, then one of two approaches can be taken. Given the sentences My wife is pregnant and My wife is not pregnant when one has no wife, then either:

Both the sentence and its negation are false; or
Strawson's approach: Both "my wife is pregnant" and "my wife is not pregnant" use a wrong presupposition (i.e. that there exists a referent which can be described with the noun phrase my wife) and therefore cannot be assigned truth values.

Bertrand Russell tries to solve this dilemma with two interpretations of the negated sentence:

"There exists exactly one person, who is my wife and who is not pregnant."
"There does not exist exactly one person, who is my wife and who is pregnant."
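Russell's two readings of the negated sentence can be written out in first-order notation. This is only a sketch: the predicate letters W ("is my wife") and P ("is pregnant") are labels chosen here for illustration and do not come from the source.

```latex
% (1) Narrow-scope negation: exactly one wife exists, and she is not pregnant.
\[
\exists x\,\bigl(W(x) \land \forall y\,(W(y) \rightarrow y = x) \land \lnot P(x)\bigr)
\]
% (2) Wide-scope negation: it is not the case that exactly one wife exists and is pregnant.
\[
\lnot\,\exists x\,\bigl(W(x) \land \forall y\,(W(y) \rightarrow y = x) \land P(x)\bigr)
\]
```

The two formulas differ only in the scope of the negation: in (1) it applies to the predicate P alone, while in (2) it applies to the whole existence-and-uniqueness claim.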
For the first phrase, Russell would claim that it is false, whereas the second would be true according to him.

Projection of presuppositions

A presupposition of a part of an utterance is sometimes also a presupposition of the whole utterance, and sometimes not. We've seen that the phrase my wife triggers the presupposition that I have a wife. The first sentence below carries that presupposition, even though the phrase occurs inside an embedded clause. In the second sentence, however, it does not. John might be mistaken about his belief that I have a wife, or he might be deliberately trying to misinform his audience, and this has an effect on the meaning of the second sentence, but, perhaps surprisingly, not on the first one.

John thinks that my wife is beautiful.
John said that my wife is beautiful.

Thus, this seems to be a property of the main verbs of the sentences, think and say, respectively. After work by Lauri Karttunen,[1] verbs that allow presuppositions to "pass up" to the whole sentence ("project") are called holes, and verbs that block such passing up, or projection, of presuppositions are called plugs. Some linguistic environments are intermediate between plugs and holes: they block some presuppositions and allow others to project. These are called filters. An example of such an environment is the indicative conditional ("if-then" clause). A conditional sentence contains an antecedent and a consequent. The antecedent is the part preceded by the word "if," and the consequent is the part that is (or could be) preceded by "then." If the consequent contains a presupposition trigger, and the triggered presupposition is explicitly stated in the antecedent of the conditional, then the presupposition is blocked. Otherwise, it is allowed to project up to the entire conditional. Here is an example:

If I have a wife, then my wife is blonde.
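The hole, plug, and filter behaviour described above can be sketched in a few lines of Python. This is a toy model under strong assumptions, not an implementation of Karttunen's theory: the class names `Clause`, `Embed`, and `Conditional` and the function `project` are hypothetical, presuppositions are plain strings, and "stated in the antecedent" is approximated by exact string equality. The classification of think as a hole and say as a plug simply follows the discussion in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Clause:
    """A simple clause: its asserted content plus the presuppositions it triggers."""
    content: str
    presuppositions: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Embed:
    """A clause embedded under a verb classified as a 'hole' or a 'plug'."""
    verb: str
    kind: str  # "hole" lets presuppositions through, "plug" blocks them
    child: object


@dataclass(frozen=True)
class Conditional:
    """An indicative conditional, treated as a filter."""
    antecedent: Clause
    consequent: Clause


def project(node):
    """Return the set of presuppositions that survive to the whole sentence."""
    if isinstance(node, Clause):
        return set(node.presuppositions)
    if isinstance(node, Embed):
        # Holes pass everything up; plugs pass nothing up.
        return project(node.child) if node.kind == "hole" else set()
    if isinstance(node, Conditional):
        # Filter: drop consequent presuppositions that the antecedent states.
        kept = {p for p in project(node.consequent)
                if p != node.antecedent.content}
        return project(node.antecedent) | kept
    raise TypeError(f"unsupported node: {node!r}")


if __name__ == "__main__":
    wife = "I have a wife"
    blonde = Conditional(Clause(wife), Clause("my wife is blonde", frozenset({wife})))
    thinks = Embed("think", "hole", Clause("my wife is beautiful", frozenset({wife})))
    said = Embed("say", "plug", Clause("my wife is beautiful", frozenset({wife})))
    for sentence in (blonde, thinks, said):
        print(type(sentence).__name__, project(sentence))
```

Exact string matching is the weakest part of the sketch; a realistic filter would need to test whether the antecedent entails the presupposition rather than whether the two strings are identical.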
In the sentence If I have a wife, then my wife is blonde, the presupposition triggered by the expression my wife (that I have a wife) is blocked, because it is stated in the antecedent of the conditional: that sentence doesn't imply that I have a wife. In the following example, it is not stated in the antecedent, so it is allowed to project, i.e. the sentence does imply that I have a wife.

If it's already 4am, then my wife is probably angry.

Hence, conditional sentences act as filters for presuppositions that are triggered by expressions in their consequent. A significant amount of current work in semantics and pragmatics is devoted to a proper understanding of when and how presuppositions project.

Presupposition triggers

A presupposition trigger is a lexical item or linguistic construction which is responsible for the presupposition.[2] The following is a selection of presuppositional triggers following Stephen C. Levinson's classic textbook on Pragmatics, which in turn draws on a list produced by Lauri Karttunen. As is customary, the presuppositional triggers themselves are italicized, and the symbol » stands for ‘presupposes’.[3]

Definite descriptions

Definite descriptions are phrases of the form "the X" where X is a noun phrase. The description is said to be proper when the phrase applies to exactly one object, and conversely, it is said to be improper when either there is more than one potential referent, as in "the senator from Ohio", or none at all, as in "the king of France". In conventional speech, definite descriptions are implicitly assumed to be proper, hence such phrases trigger the presupposition that the referent is unique and existent.

John saw the man with two heads.
» There exists a man with two heads.

Factive verbs

In Western epistemology, there is a tradition originating with Plato of defining knowledge as justified true belief. On this definition, for someone to know X, it is required that X be true.
A linguistic question thus arises regarding the usage of such phrases: does a person who states "John knows X" implicitly claim the truth of X? Steven Pinker discusses the verb "learn" as an example of a factive verb in George W. Bush's statement that "The British government has learned that Saddam Hussein recently sought significant quantities of uranium from Africa." The factivity thesis, the proposition that relational predicates having to do with knowledge, such as know, learn, remember, and realize, presuppose the factual truth of their object, was, however, subject to notable criticism by Allan Hazlett.

Martha regrets drinking John's home brew.
» Martha drank John's home brew.

Frankenstein was aware that Dracula was there.
» Dracula was there.

John realized that he was in debt.
» John was in debt.

It was odd how proud he was.
» He was proud.

Some further factive predicates: know; be sorry that; be proud that; be indifferent that; be glad that; be sad that.

Implicative verbs

John managed to open the door.
» John tried to open the door.

John forgot to lock the door.
» John ought to have locked, or intended to lock, the door.

Some further implicative predicates: X happened to V » X didn't plan or intend to V; X avoided Ving » X was expected to, or usually did, or ought to V, etc.

Change of state verbs

John stopped beating his wife.
» John had been beating his wife.

Joan began beating her husband.
» Joan hadn't been beating her husband.

Kissinger continued to rule the world.
» Kissinger had been ruling the world.

Some further change of state verbs: start; finish; carry on; cease; take (as in X took Y from Z » Y was at/in/with Z); leave; enter; come; go; arrive; etc.

Iteratives

The flying saucer came again.
» The flying saucer came before.

You can't get gobstoppers anymore.
» You once could get gobstoppers.

Carter returned to power.
» Carter held power before.

Further iteratives: another time; to come back; restore; repeat; for the nth time.

Temporal clauses

Before Strawson was even born, Frege noticed presuppositions.
» Strawson was born.

While Chomsky was revolutionizing linguistics, the rest of social science was asleep.
» Chomsky was revolutionizing linguistics.

Since Churchill died, we've lacked a leader.
» Churchill died.

Further temporal clause constructors: after; during; whenever; as (as in As John was getting up, he slipped).

Cleft sentences

Cleft construction: It was Henry that kissed Rosie.
» Someone kissed Rosie.

Pseudo-cleft construction: What John lost was his wallet.
» John lost something.

Comparisons and contrasts

Comparisons and contrasts may be marked by stress (or by other prosodic means), by particles like "too", or by comparative constructions.

Marianne called Adolph a male chauvinist, and then HE insulted HER.
» For Marianne to call Adolph a male chauvinist would be to insult him.

Carol is a better linguist than Barbara.
» Barbara is a linguist.

Counterfactual conditionals

If the notice had only said ‘mine-field’ in English as well as Welsh, we would never have lost poor Llewellyn.
» The notice didn't say ‘mine-field’ in English.

Questions

Is there a professor of Linguistics at MIT?
» Either there is a professor of Linguistics at MIT or there isn't.

Who is the professor of Linguistics at MIT?
» Someone is the professor of Linguistics at MIT.

Possessive case

John's children are very noisy.
» John has children.

Accommodation of presuppositions

A presupposition of a sentence must normally be part of the common ground of the utterance context (the shared knowledge of the interlocutors) in order for the sentence to be felicitous. Sometimes, however, sentences may carry presuppositions that are not part of the common ground and nevertheless be felicitous. For example, I can, upon being introduced to someone, out of the blue explain that my wife is a dentist, this without my addressee having ever heard, or having any reason to believe, that I have a wife. In order to be able to interpret my utterance, the addressee must assume that I have a wife. This process of an addressee assuming that a presupposition is true, even in the absence of explicit information that it is, is usually called presupposition accommodation. We have just seen that presupposition triggers like my wife (definite descriptions) allow for such accommodation.

In "Presupposition and Anaphora: Remarks on the Formulation of the Projection Problem",[7] the philosopher Saul Kripke noted that some presupposition triggers do not seem to permit such accommodation. An example of that is the presupposition trigger too. This word triggers the presupposition that, roughly, something parallel to what is stated has happened. For example, if pronounced with emphasis on John, the following sentence triggers the presupposition that somebody other than John had dinner in New York last night.

John had dinner in New York last night, too.

But that presupposition, as stated, is completely trivial, given what we know about New York. Several million people had dinner in New York last night, and that in itself doesn't satisfy the presupposition of the sentence. What is needed for the sentence to be felicitous is really that somebody relevant to the interlocutors had dinner in New York last night, and that this has been mentioned in the previous discourse, or that this information can be recovered from it. Presupposition triggers that disallow accommodation are called anaphoric presupposition triggers.

Presupposition in Critical discourse analysis

Critical discourse analysis (CDA) seeks to identify presuppositions of an ideological nature. CDA is critical, not only in the sense of being analytical, but also in the ideological sense.[8] Van Dijk (2003) says CDA "primarily studies the way social power abuse, dominance, and inequality" operate in speech acts (including written text): "text and talk".[8] Van Dijk describes CDA as written from a particular point of view:[8] "dissident research" aimed to "expose" and "resist social inequality."[8] One notable feature of ideological presuppositions researched in CDA is a concept termed synthetic personalisation.

This paper explores the implications of the principles and parameters theory of Universal Grammar for language teaching. Learning the core aspects of a second language means re-setting values for parameters according to the evidence the learner receives, perhaps starting from the L1 setting. Classroom acquisition depends crucially on the provision of appropriate syntactic evidence to trigger parameter-setting; certain aspects of vocabulary are also crucial. Variability, active production or comprehension, interaction, consciousness-raising and hypothesis-testing are irrelevant. Implications for the classroom can only be drawn for core areas of grammatical competence. Existing textbooks already supply appropriate evidence for parameter-setting; the grammatical component of syllabuses may be improved by use of principles and parameters, as may the teacher’s awareness of language, even if this reveals what does not need to be taught.

UNIVERSAL GRAMMAR AND LANGUAGE ACQUISITION

like English and French (“non-pro-drop” languages). the crucial evidence must be freely available to all children rather than a select few. such as Hawkins (1983. “Max played the drums” is grammatical because the verb occurs in the correct head-first position.e. Acquisition in the UC model. The concerns that linguists have within this model relate chiefly to the nature of the evidence that the learner needs to encounter. which has looked at earlier models of language acquisition with different emphases or has looked at syntactic issues that are not directly relevant to this model. rather than from negative evidence such as correction or sentences people do not say. So. 1985) yet acquire language. while all languages have the same principles of phrase structure.e. a Prepositional Phrase such as “with Charlie Parker” must have a head that is a preposition. Some principles lay down the relationship between items that have been “moved” in the sentence. direct and indirect objects).parameters in the learner’s mind. and the preposition comes after its complement (and so is known as a postposition). is intransitive). no human language breaches them. Adjective Phrases. does it actually happen?) and uniformity (is it available to all children?). the English passive reflects the combined effects of principles of syntactic movement. principles of phrase structure require every phrase in it to have a head of a related syntactic category and permit it to have complements of various types. and “Charlie Parker” must have noun heads and may. hearing “John ate an apple” the child learns it is head-first in English. but in this case do not. Noun Phrases such as “Max”. or “relative clauses”. this is different not only from the type of Universal Grammar studied by those concerned within the implicational universals tradition.) It is not a tenet of the theory that the whole of Universal Grammar is necessarily present from the start. Principles do not vary from one language to another. 
and this depends upon the lexical item that is used. and of case. SORRY FIG NOT AVAILABLE Fig. Kahuli children for example are not treated as conversational partners for the first few years of life (Schieffelin. 1987). which has to be set to a different position for certain languages (the “marked” setting) but not for others. the crucial aspects of a language for the learner to master are the appropriate settings for the parameters. The difference between the phrase structures of different languages lies in the order in which head and complement occur within the phrase. or any particular construction as such.e. it does not deal with the “passive”. in English the head verb comes before the complement. or whatever. Parameters confine the variation between languages within circumscribed limits. described for instance in Chomsky (1988) and Cook (1988). as the eyes are present. it is necessary to specify that the present article considers Universal Grammar (UG) within the current Chomskyan model. we need to know whether a complement is actually allowed. and to the starting position for. consequently pro-drop is the “unmarked” setting. all that has to be learnt is whether the setting for the head parameter is head-first or head-last. compared to “Max played”. others concern the ways in which words such as “himself” may or may not corefer with the same entity as other words in the sentence (the Binding Principles). Hyams (1986) took the example of the “prodrop” parameter-whether a language permits subjectless declarative sentences. The learner needs to hear relevant evidence for setting the parameters of the grammar. instead rules are seen as the interaction of various principles and settings for parameters. since the learner already knows the principles as they are part of his or her mind. nothing more than this framework is needed to describe acquisition-no learning strategies. and Prepositional Phrases. it is normally transitive). 
The setting for the parameter might be neutral and so equally settable for any language. Gboudi. subject to other constraints on the child’s use of language.Given the widely differing interpretations of Universal Grammar. hence it is sometimes called the “principles and parameters” model. This is not true only of English. which maintain that all principles and parameters are equally present at all times. Due to the Projection Principle the acquisition of vocabulary means not just learning the meanings and pronunciations of words but also learning what structures the words can be used in. they do not have to be learnt. The question of whether the phrase structure of a sentence is grammatical is a matter not just of whether it conforms to the overall possible structures in the language but also whether it conforms to the particular structures associated with the lexical items in it. as captured in the Government/Binding theory of syntax (Chomsky. In addition the evidence available has to meet the requirements of occurrence (i. but also from much of the L2 discussion of Universal Grammar. The settings for parameters are not constant but vary from one language to another. that is to say naturally occurring sentences. 1981. not in terms of rules. Current UG theory describes the speaker’s knowledge of language in terms of principles and parameters. 1. have complements. 1988. The fact that something is dictated by our genes does not mean it is necessarily present from the start. the adjective before its complement (“easy to play”). while that for the verb “give” must specify that it has two complements (i. The interpretation of acquisition in which the child creates hypotheses that are modified in the light of feedback is no longer accepted since such appropriate feedback has never been found. This variation between languages is captured by the head parameter. “the drums”. 1988). 
The Universal Grammar theory claims that the speaker’s knowledge of a language such as English consists of several such general principles and of the appropriate parameter settings for that language. and “growth” models which hold that principles unfold in a developmental sequence. Japanese is the opposite in that the head verb comes after the complement in the Verb Phrase. as in “E wa kabe ni kakatte imasu” (picture wall on is hanging). since virtually all normal children learn their first language. all that is needed is sufficient evidence to set the values for the parameters. Given the learner knows the phrase structure principles. Complementary to these phrase structure principles is the Projection Principle which claims that syntax and the lexicon are closely tied together. “play”. children learning English or French have to set the pro-drop parameter away from its first setting while those learning Chinese or Spanish can retain the original setting. or there might be a preferred position (the “unmarked” setting). Since it relies on triggering from evidence. Instead it is neutral between “no-growth” models. nevertheless a bare sentence or two may suffice to demonstrate how a parameter should be set. the phrases of all languages consist of heads and possible complements Japanese. hearing “Mukashi mukashi ojihisan to obaasan ga koya ni sunde imashita” (once upon a time an old man and old woman cottage in lived) the child learns that the setting is head-last in Japanese. non-pro-drop the marked. As well as knowledge of where the complement goes in the phrase. they differ in their setting for the head parameter. compared to “Max the drums played” and because the verb “play” has an Object Noun Phrase following it. A simplified picture of acquisition in the UG model is then as shown in Fig. the lexical entry for the verb “faint” must specify it has no complement (i. each of which also applies to other areas of the grammar. for a fuller discussion see Cook (1989a). 
It does not matter whether the learner is faced with Japanese or English. As the principles of UG are built-in to the mind. and may have a complement “the drums”. hence the Projection Principle states that the English verb “play” must be specified as taking a complement (i. instead it may reveal itself over time. thus the crucial point to learn about the verb “play” is that it is used with a following object . and a complement “Charlie Parker”. “with”. Noun Phrases. a single sentence such as “Max played the drums with Charlie Parker” may be enough to set the head parameter for English and thus impart a knowledge of how to construct Verb Phrases. To take an English example sentence “Max played the drums with Charlie Parker”. any theory of language acquisition cannot therefore rely on particularly beneficial conversation with adults.“play something”. which has two settings “head-first” and “head-last” according to whether the head comes before or after the complement in the phrases of the language. UG does not have a learning theory as such. (See Cook. Alongside this there is massive learning of vocabulary in a particular form. 1. The input to the child is vital for triggering parameter-setting since nothing would happen without it. motivations. and the noun before its complement (“belief that he can play well”). of phrase structure. For this the learner needs linguistic evidence in the form of actual sentences spoken by the people around him or her.e. The model of acquisition is essentially straightforward. Catalan. . The second issue is the initial setting for parameters. because they are built-in to the human mind. as in questions and passives (Subjacency Principle). A Verb Phrase such as “played the drums” must have a head that is a verb. Many arguments have suggested that the learner must be able to learn solely from “positive” evidence. the same principles of phrase structure apply. 
She argued that children start from the pro-drop setting in that their early sentences in all languages omit subjects. like Chinese and Spanish (“pro-drop” languages) or does not permit them. the learner automatically applies them to whatever language he or she encounters. and so on. Cook. cognitive or social schemas. interesting as this question may be in its own right. 1989b for a different interpretation of pro-drop. as milk teeth yield to permanent teeth and finally to wisdom teeth. This model is not centrally concerned with conventional “rules”. the head preposition comes before its complement.
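The head-parameter setting discussed in this section can be illustrated with a small Python sketch. It is an illustration only and makes assumptions that are not part of UG theory: each piece of positive evidence is given as a phrase (a list of words) together with its head word, and the function names `classify_phrase` and `set_head_parameter` are hypothetical. The example phrases are taken from the sentences quoted in the text.

```python
from collections import Counter


def classify_phrase(words, head):
    """Classify one phrase as 'head-first' or 'head-last', or None if unclear.

    Assumption: the head counts as 'first' when it is the initial word of
    the phrase and as 'last' when it is the final word.
    """
    if len(words) < 2 or head not in words:
        return None
    if words[0] == head:
        return "head-first"
    if words[-1] == head:
        return "head-last"
    return None


def set_head_parameter(evidence):
    """Set the parameter from positive evidence only (phrases actually heard)."""
    votes = Counter()
    for words, head in evidence:
        setting = classify_phrase(words, head)
        if setting is not None:
            votes[setting] += 1
    if votes["head-first"] == votes["head-last"]:
        return None  # no usable evidence, or evenly split: leave unset
    if votes["head-first"] > votes["head-last"]:
        return "head-first"
    return "head-last"


# Verb Phrase and Prepositional Phrase from "Max played the drums with Charlie Parker".
english = [(["played", "the", "drums"], "played"),
           (["with", "Charlie", "Parker"], "with")]
# Postpositional phrases "kabe ni" (wall on) and "koya ni" (cottage in).
japanese = [(["kabe", "ni"], "ni"),
            (["koya", "ni"], "ni")]

if __name__ == "__main__":
    print(set_head_parameter(english))
    print(set_head_parameter(japanese))
```

The sketch uses no correction or negative evidence, in keeping with the positive-evidence requirement, and a single classified phrase is already enough to fix the setting.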

partly because teachers are unacquainted with the pro-drop and head parameters (see Cook. which mediates L2 via Sorry FIG not available Fig. The question of evidence is more open since unlike L1 children many L2 learners receive copious correction of their errors and grammatical explanation. The classroom learner is setting values for parameters from positive evidence. The case of L2 learning may be slightly different in that L2 learners may be more developed in all the aspects except language than the L1 child learner. they have no relevance to UG related areas. Neither for first nor for second language acquisition can it be said that UG acquisition models are based on extensive empirical research within the principles and parameters framework. understanding the “message” in the sentence or taking part in a mutually constructed dialogue: hearing the sentence is enough. But evidence from actual children is not of prime importance to the theory for two reasons. slower speed (Mannon. there should be no intrinsic difference between classroom acquisition and any other form of involved in UG theory. and the absence in L2 learners. The UG theory is arguably of minor importance in dealing with how people communicate. Does a Japanese learner approach English with a head-last setting for the head parameter or is he or she neutral between settings? This reintroduces the issue of transfer into L2 learning research in a new form. the theory claims that acquisition research can establish what must be built-in to the mind without reference to an actual child at all by comparing what the speaker knows with the possible language evidence he or she has encountered. not acquisition. (1985) is beside the point. if we can show that a speaker knows something about language. or how their language behaviour varies from one situation to another. What would such appropriate evidence consist of? 
On the one hand there is the extreme case where it is believed that learning may take place on the basis of one sentence or a small set of sentences. however. this can be put as a choice between a “direct access” model that suggests that UG is still available for L2 acquisition. Any sentence from a child we try to study is a product of development. and cognitive stage. First of all. much of the ensuing discussion will be concerned with reminding the reader that UG theory is neutral about many of the issues that arise in the classroom. The input of language sentences and the output of language knowledge are the same in the L1 and L2 models. Adjective Phrase. namely the relationship between UG and L2 learning. It would be misleading to attempt to draw conclusions from UG theory for anything other than the central area that is its proper domain. however. 1986). and a “no access” model in which UG is no longer available for L2 learning (Cook. much of it that purports to deal with Universal Grammar is dealing with areas of syntax that are not principle-based or. AP. so far as researchers can tell. and McLaughlin (1987). UG theory does not regard language acquisition as depending upon particular circumstances. On the whole then the well-established features of teacher-talk-shorter utterances (Wesche and Ready. Interlanguages seem to stay within the limits of possible human languages rather than to go against the UG principles. The no access model logically leads to treating language like any other area of learning and so in school terms to dealing with it in the same way as say geography and gymnastics. social development. rather than in how the speaker uses language. that is to say Spanish learners of English assume it is prodrop. 1988 for further discussion). then we can demonstrate it must be part of the speaker’s mind-the “poverty of the stimulus” argument. 1985). and dependent on the child’s memory capacity. 2. say the phrase structure principles. 
provided the learner has sufficient examples of appropriate sentences to trigger the settings for the various parameters. since the only type of grammar that may be entertained by the language faculty of the human mind must conform to the principles of UG. as with the head parameter example. of grammars that breach principles of UG. 2. hence work with learners strategies in the classroom such as those enumerated by O’Malley et al. even if not in the same way as in the first language or the second language or in either of them. It is not necessary for the learner to do anything in particular. even if these appear to be not quite the settings that UG theory utilizes (Cook. the uniformity and occurrence requirements mean that it deals with features that can be learnt regardless of situation. such as reduced short-term memory capacity. indeed strictly speaking it is about grammar rather than about language. using actual children’s use of language for learning about acquisition necessitates disentangling the thread of language acquisition from all the others with which it is interwoven. it is dubious. French learners that it is non-pro-drop. parameter setting will take place. This article is not the place to survey the contribution of UG theory to L2 learning in general. The learner’s grammar will conform in one way or another to the principles of UG. rather than both groups starting from the same position. grammatical competence. UG AND CLASSROOM LEARNING The UG model is primarily about language knowledge. in progress. At all stages the learner’s interlanguage will . 1986). and that this could not be worked out from the sentences the learner hears. in which language acquisition is combined with physical. as it does not reflect acquisition of any of the core areas of syntax. On the other hand some of the necessary evidence may be indirect. The picture of L2 learning can be diagrammed as in Fig. social and cognitive development. Verb Phrase. pragmatic competence. 
and linguistic. However important the concepts of variability and the context of language acquisition with respect to the type of language knowledge language learning may be to other areas of L2 learning research (Ellis and Roberts. While the effects of this on the knowledge of parameters appear minimal. called in Cook (1989b) “onetime setting”. an “indirect access” model that claims it is only available via the mediation of the L1. The present argument presupposes that UG theory is relevant to L2 learning. The arguments against “no access” are briefly the difference of the language system from other cognitive systems. when they are. regardless of variation between learners. all of which have an indirect connection to language acquisition proper. When looking at the relevance of UG theory to classroom learning we need to remember its restricted scope-general principles of syntax such as the phrase structure principles and precise areas of variation such as the pro-drop or head parameters. PP. Secondly the theory separates the idealized picture of acquisition that is its concern from the history of the child’s actual development. the intervening parameter-setting differs according to whether one adopts the direct access model. they cannot be entirely dismissed. Preposition Phrase. Hyams (1986) believes that the crucial element in the English child’s switching to non-pro-drop is not the absence of subjectless declarative sentences themselves. Furthermore it is concerned with the abstract central areas of syntax rather than with broader aspects of language. Again an overall question arises. so that language knowledge would be acquired with difficulty via alternative routes. since teachers do not have the academic knowledge to correct errors or make explanations on the basis of such syntactic principles and parameters.Second language researchers have had similar concerns. say. as UG theory implies (Cook. 
which is a by-product of the non-pro-drop setting and absent from pro-drop languages-a form of positive evidence. The question of parameter-setting in L2 learning is interesting because there is already one setting for the parameters present in the learner’s mind. Most L1 and L2 research has dealt with “rules” not “principles”. not language use. and regardless of types of input. L2 learning. L2 development is still not. VP. To set the parameter correctly sometimes requires a range of syntactic forms rather than just one paradigm sentence. Classroom second language learning and teaching is made up of many components-psychological. magnified by the problems peculiar to second language learning. UG theory can play only one part in this framework. except in so far as they segment the input more readily into grammatical constituents. immune to the effects of other cognitive deficits in a second language. which treats L1 and L2 entirely differently. so long as the classroom provides appropriate evidence. in L2 learning this may be modified by any transfer of parameter setting from the L1 . or language development. its interests lie in what the speaker knows about language. the question is how much influence this exerts on L2 learning. Furthermore L2 learners may be exposed to forms of evidence such as explanation or correction which are rare in L1 acquisition. or how they meet and understand other people. a form of negative evidence. Flynn (1988). Ellis (1985). as outlined above. 1986). NP. or the indirect access model. less subordination (Ishiguro. not based on the actual principles proposed within the Government/Binding theory. Noun Phrase. broadly similar accounts will be found in Cook (1988). social. Nor can L2 knowledge be derived from particular types of interaction or behaviour by the learner. If UG is involved in L2 learning. does the L2 learner transfer the L1 setting to the new language or start from scratch? 
Research by White (1986) on the pro-drop parameter suggests that the first language setting is carried over to the second. 1987). L1 setting for parameters. whether they receive correction of the appropriate errors or grammatical explanations of the right type to learn the types of syntactic knowledge. but the presence of the “expletive” subjects “there” and “it”. Lightbown and White (1988). 1989a). indeed experiments with Micro Artificial Languages have shown that learners can choose appropriate settings for the head parameter from around 30 sentences. and so on-have nothing to do with the desirable properties of input for a UG model. specifically looking narrowly at the principles and parameters version of UG. something at present impossible. Morgan. 1988).

Much L2 learning research has prided itself on discovering sequences of acquisition. the difficulty. without such feedback the learner would never know whether the hypothesis were correct. Thus Error Analysis misses the point if it emphasises the self-contained internal system of the learner’s language rather than seeing it as one of the possible instantiations of human language. generalized to the L2 situation. 1984). UG also emphasizes the importance of vocabulary. Thus for instance it is unjustifiable to invoke UG theory or indeed any Chomskyan view of language acquisition as supporting the provision of explicit rules to the learner. learner languages vary within the finite possibilities set by UG. The UG model is then neutral so far as the interaction in the classroom is concerned. The UG approach may indeed tackle the most profound areas of L2 acquisition. The study of classroom L2 learning needs to operate within a framework that includes not only a linguistic model of acquisition such as UG but also psychological models of speech processing. functions. practice or production is neither good nor bad as the parameter is either set or it isn’t. Far from the claim of the standard Critical Period Hypothesis that there is a cut-off point for language acquisition. since it results from the setting of a handful of parameters. The comparative simplicity of syntax learning is achieved by increasing the burden of vocabulary learning. Nor is it possible to interpret UG theory as supporting the hypothesis-testing theory in the form in which it became familiar in L2 research and language teaching: the learner makes a hypothesis about the grammar. in that all the principles are present in the minds of the older L2 learners. not just in the conventional way of knowing their dictionary meaning or pronunciation but also in knowing the way they behave in sentences.
there might be a tendency to start with “unmarked” settings (in so far as these are not in any case synonymous with “learnt earlier”). since the rest is innate. language development. 1978). and so cannot be an essential component of L1 acquisition. the first conversation the students hear . 1986). whether teachers vary the types of question (Long and Sato. a possibility denied by the current theory. UG is proof against situation and against learner variation because of its central importance. and far from the usual claim of the Monitor Model that acquisition can take place at any time while conscious learning may occur only after a certain age (Krashen. But this tendency is likely to be obscured in the L1 by the gross developmental changes in the child’s other attributes. it is learning that in English the verb “play” needs to be followed by a Noun Phrase. or engage in the three-fold classroom moves of Sinclair and Coulthard (1975) is irrelevant so far as the UG areas of syntax are concerned as these are not acquirable by such means. once these are established. the model has no way for conscious knowledge to become unconscious. perhaps these are precisely the parts of language that have to be learnt. But in first language acquisition correction of the appropriate syntax is not universally provided. foreign accent. and an endless list. A few sentences are all that is needed to set parameters. whatever its merits on other grounds (Rutherford. Cromer (1987) for example showed that giving children 10 examples of sentences illustrating the “eager/easy to please” construction every three months without telling them if they were right or wrong brought them way ahead of their peers. but the necessary input may consist of a handful of sentences.. it would suggest that L2 learning were taking place through some faculty other than the language faculty. He or she needs however to acquire an immense amount of detail about how individual words are used. 
the parameter settings probably need rather little attention. grammatical explanation may work for aspects of grammar that are not the central factors of UG. 1984) to see what linguistic evidence it provides the students. the UG principles are not learnt. “Proper” language knowledge must be derivable via triggering from positive input. or active production by the learners. as if a sequence were itself an explanation rather than a fact that needed to be explained. or provide corrective feedback of various types (Day et al. and indeed much of the interest. it cannot be taken to support or deny various positions that are outside its remit. Like other contemporary linguistic theories. such uses are not part of UG which is concerned with knowledge of language grammatical competence. common for instance in language teaching methods from the audiolingual to the communicative. as first argued by Braine (1971). UG theory is simply neutral. a sociolinguistic model of discourse interaction. motivation. The argument here has implied that UG is only one component out of many in L2 learning. The crucial point so far as UG is concerned is that the appropriate input for triggering particular parameters be available to the learner. such peripheral areas as the acquisition of closed-class grammatical morphemes may well yield to such treatment. nor whether it conveys a message. 1969). Unit 1 of the course concentrates on “My name’s . nor the properties of the input in other syntactic terms. situational purpose. a questionnaire I administered to 351 students of English found that they placed the statement “I want to learn more English words and phrases” second out of 10 possible aims for their English course. and that the resulting knowledge acquired was “language-like” rather than true grammatical competence. Some research has shown that a minimal amount of data may turn the switch for the learner. 
It has often been reported that learners feel vocabulary to be particularly important (Hatch. regardless of its wild deviancy from the L1 or the L2. this suggests that an L2 learning theory cannot depend solely on a particular type of interaction that is highly idiosyncratic and does not occur in situations where some L2 learners have been shown to be successful. So far as the learning of other aspects of language than the syntactic core. pragmatic competence is multi-functional and includes a communicative function as well as others. There is no warrant for seeing “consciousness-raising” in the form of explicit statements of grammatical rules to learners as having anything to do with UG theory. if not perhaps in the way that either learners or teachers presently conceive of it. 1983). On the other hand the complexity. . But. there may be a difference between older L2 learners and younger L2 learners or L1 learners. Let us take a modern beginners course The Cambridge English Course (Swan and Walters. If the “growth” model of UG is accepted. The UG theory has nothing in common with models that stress amount of practice. and an educational model of the values and purposes of language teaching. those that are central to language and to the human mind. And of course whatever explanations were vouchsafed to learners would need to be in terms of principles (which they already possess unconsciously anyway) rather than of construction-specific rules. It may be that communicative goals imply other forms of learning. 1987). or life in England. The main UG theory is neutral about L1 sequence. But such views cannot be accommodated within the areas of language acquisition covered by the UG theory itself. UG theory clearly has little to say about many of the controversies about classroom language learning. We come then to the question of sequence of development. 
the acquisition of older L2 learners will reflect UG better than that of younger L1 or L2 learners since they would have all the principles simultaneously present. The uniformity requirement of the UG model insists in addition that whatever it is that fosters acquisition in the input is freely available in all L1 situations. in L2 learning may be the aspects that are not predictable from UG theory-learner variation. A major learning component according to the UG theory will indeed be vocabulary.“. it is not just a matter of the beginner in English learning the syntax. If such evidence were to help learners to set parameters in L2 acquisition. tries it out and modifies it in the light of how successful it is. Again this is not to say that such explanations would not work for aspects of grammar or of language outside the UG purview. The evidence for setting the head parameter needs to be sentences showing Object complements following verbs rather than preceding them. is it possible to venture some simple concrete applications to language teaching? So far as classroom interaction is concerned we have seen that the most that the UG theory would recommend is the provision of adequate language samples for parameter setting to take place. On the one hand this is indeed proof of their central importance to language learning. there may be rather little to say about them. and cognitive development. where the learner needs to acquire masses of words. triggering implies no deeper processing than syntactic and lexical codebreaking. so far as the active comprehension advocated by supporters of “Listening First” methodologies (Cook. not the amount of input in terms of quantity. “It must be recognized that one does not learn the grammatical structure of a second language through ‘explanation and instruction’ beyond the most rudimentary level for the simple reason that no one has enough explicit knowledge about this structure to provide explanation and instruction” (Chomsky. 
in the L2 by the more subtle deficiencies in the learner’s other cognitive systems when using the L2. Knowledge of language is not conscious. and meaning of “He plays football”. after “I want to practice English so that I can use it outside the classroom”. Needless to say. always provided that UG is in fact available to the L2 learner. reflect UG principles. 1982). if a growth model of UG is correct and UG is still available. Having produced so many caveats. a parameter is a switch with two or more discrete positions rather than a steadily increasing response strength. there is indeed the necessity for the learner to be aware of syntactic categories and of vocabulary “meanings” but there seems no particular need for the depth of semantic processing suggested by such models. function. The L2 learner needs to spend comparatively little effort on phrase structure. Such a process requires feedback to the learner concerning the correctness of his or her temporary hypothesis. this shows how linguistic evidence supplied at the right time may facilitate acquisition of syntax. The provision of input is crucial to acquisition. and some way above structures. Though not couched in terms of the current UG theory.

John Dirkx and Patricia Cranton have researched "soulfulness" and the process of intuition and the unconscious on our meaning-making. Above all we need to reinterpret the grammatical syllabus in terms of principles rather than separate rules or “structures”. through true emancipation from sometimes mindless or unquestioning acceptance of what we have to come to know through our life experience. any small sample of English sentences must reflect this basic fact. transformative learning theory has expanded to include what Jung considers the "unconscious functions. namely that it has to be followed by a complement. being aware of what is taken care of by UG can free the teacher to pay attention to other things that actually need teaching. which collects information on the English spoken by 19 different groups of learners. and so bound to be present in the input. UG may be of help at one stage removed from the student. Laissez-faire works because of the accidental reason that the necessary evidence is simple and common. and Thai. it isn’t”. Unit 9 introduces “weather report”. in adulthood we may more clearly understand our experience when we know under what conditions an expressed idea is true or justified. utterances. Unit 10 teaches dummy “it” in “It’s a man”. and personalities may predispose us towards. “There will be snow”. One volume of the Cambridge Handbooks for language teachers is Learner English (Swan and Smith. the syllabus often gives the impression of consisting of discrete grammatical items-the present tense. a crucial generalization is being overlooked. Constellations of syntactic structures might be combined that so far have been widely separated. especially those things that our culture. without our active engagement and questioning of how we know what we know. 2000. shaping and maintaining intersubjectivity • Relating events. 99). memorizing tax codes or learning historical facts and data. . 
Developing more dependable beliefs of our experience.has two examples of Object Noun Phrases following “is”. the presence of expletive subjects such as “it” and “there”. Unit 7 in such sentences as “There’s some water in the big field”. But secondly the use of the Government/Binding model of syntax associated with UG theory can provide insights to the teacher confined to the traditional models of grammar used in language teaching. being able to develop this self-authorship goes a long way towards helping our society and world to become a better place through our greater understanding and awareness of the world and issues beyond us. “it” in “It rains from January to March” and “It’ll cloud over tomorrow”. 85). Firstly. To a UG theorist these all reflect the pro-drop parameter. Furthermore the student is learning properties of the verb “is”. We make meaning with different dimensions of awareness and understanding. everyone knows that. Turning to the pro-drop parameter the evidence needs to be the absence of null subject sentences. together with “there”. relative clauses. CORE PRINCIPLES OF TRANSFORMATIVE LEARNING THEORY . critical reflection on assumptions. Recent approaches to transformative learning also include transformation through our intuitive. And it would be surprising if it weren’t. and can help us to improve our role in our lives and those of others. for Spanish “subject personal pronouns are largely unnecessary” (p. Introduction to Transformative Learning Transformative Learning is a theory of deep learning that goes beyond just content knowledge acquisition. standards. Again everything necessary to set the parameter is introduced within the first few weeks of the course. Teachers I have spoken to have indeed found the two examples of the head parameter and the pro-drop parameter useful insights that help them to understand their students. and validated meaning by assessing reasons. 66). 
The argument in favour of this originally was that our ignorance of language acquisition meant we interfered with it at our peril. Context Context – justification of much for much of what we know and believe. In so far as grammar forms part of contemporary syllabuses. Making Meaning as a Learning Process Adult Learning needs to emphasize contextual understanding. and Spanish (p. Bruner identified four means of meaning making: • Establishing. 1988): leave the student alone so that the natural processes of his or her mind can get to grips with language. unconscious processes.4). This is no longer the case so far as current UG theory is concerned: the contents of the speaker’s mind and the evidence necessary for acquisition are known. our values and our feelings. and behavior to the action taken • Construing of particulars in a normative context – deals with meaning relative to obligations. and. similar comments are made about Greek. historical and cultural – in which they are embedded. one of the major aspects of the phrase structure of English.MEZIROW & OTHERS. While transformative learning theory originally consisted of critical self-reflection and disorienting dilemmas to make cognitive adjustments to reframing one's world. 2000. say. conformities and deviations • Making propositions – application of rules of the symbolic syntactic. . In the absence of fixed truths and confronted with often rapid change in circumstances. for setting the pro-drop parameter. Facts that are part of general principles don’t need to be taught. 232) and “Not only interrogatives but also other sentences with inverted word order are also error-prone” (p.“‘. 232). 1987). Portuguese. according to Hyams. object-attribute. religions. to some extent this already takes place since the teacher is not aware of the general principles or specific parameters that he or she has been covering: ignorance is bliss. yes/no questions. Take the case of prodrop. p. 
In other words in the first minutes of the course the student is given sufficient information for setting the head parameter. the only possible confusion is the use of questions such as “Is your name Mark Perkins?” in the same context where the Object is separated from the verb by the Subject. Our understandings and beliefs are more dependable when they produce interpretations and opinions that are more justifiable or true than would be those predicated upon other understandings or beliefs (Mezirow. something eschewed by all EFL course books. 79). and the use of “seem”, grammatical topics whose common factor almost certainly has never been emphasized in teaching! The use of the concept of parameters by teachers depends upon a decision whether L1 parameter-settings are transferred or not. even if they are not infrequent in ordinary spoken English for performance reasons. both at the general level of the need for positive evidence. A student does not need to learn that a phrase always has a head of a related syntactic category because. which tends to drop them when they may be understood” (p. the definite article, rather than the interlinked knowledge that the UG theory suggests. ‘Hello my name’s . for Italian we learn “use of the subject pronoun is not obligatory” and “the order of subject and predicate is freer than in English” (p. the first practice for the students is “Say your name. 3-4). these are all critical to adult learning. this would involve at least wh-questions. and identity-otherness • Transformative Learning – (Mezirow addition) Becoming aware of one's own tacit assumptions and expectations and those of others and assessing their relevance for making an interpretation (Mezirow. quite literally. there is the teacher’s awareness of language and of syntax. the passive. A traditional interpretation of Chomskyan thinking to the classroom is what I have termed elsewhere “laissez-faire” (Cook. together with remarks about missing “it” in Portuguese (p.
and conceptual systems used to achieve decontextualized meanings. . p. except in short answers “No. It is a desirable process for adults to learn to think for themselves. p. depends on the context – biographical. we cannot fully trust what we know or believe (Mezirow. 4). For us as adults to truly take ownership of our social roles. Behind the classroom stands the syllabus. 2000. or learning equations. Finally as a slightly tangential point. for instance suppose that we wanted to teach “movement”. just as it is hard for any small sample not to use all the phonemes of English. 85) and “subject-verb and verb-subject do not regularly correspond to statement and question respectively” (p. considering them in the context of our lives and being able to improve our decision-making based on our insights. and our personal roles." Dirkx and Cranton expand on this in their work. including rules of inference and logic and such distinctions as wholepart. and at the specific level of the need for “expletive” subjects. Teachers are missing an important insight if they see these as separate bits of information about different languages rather than as a two-way variation in languages. one consequence of UG might be that the division of what needs and doesn’t need to be taught can be based on a notion of principles and parameters. Unit 5 of the Cambridge course introduces existential “there” in such sentences as “There’s an armchair in the living room”. A glance at current syllabuses may disclose areas which can be eliminated for these reasons. for Chinese “English uses pronouns much more than Chinese.

2000. a particular religious world view – this often requires a critical assessment of assumptions supporting the justification of norms (Mezirow. read and comprehend • (Metacognition) monitor own progress and products as they are engaged in first order cognitive tasks • Epistemic cognition – explain how humans monitor their problem solving when engaged in ill-structured problems – limits of knowledge. It is all about win or lose and unfortunately for most of us. Weis brings up the intuitive process. to maintain a balanced budget. we need to consider alternatives – what are we willing to give up. or those we consider essential. is that specialized use of dialogue devoted to searching for a common understanding and assessment of the justification of an interpretation or belief (Mezirow. and to be able to reflect on new points of view and information and often go back and reconstruct what we know and how we know it. 2000 p 4-5). Since it is here where our sense of self and our values are interwoven. intentions. A Frame of Reference has two dimensions: • A Habit of Mind – Broad-based assumptions that act as a filter for our experiences. guilt. Sometimes this may be expressed as our point of view (Mezirow. looking and reflecting on alternate points of view and often creating a new. we must be mindful of viewpoints that challenge our beliefs. Inspiration. Goleman's research shows that emotional intelligence accounts for about 87% of success at work! (Mezirow. p. 2000. p. and making a personal judgment based on a new assessment of the information. and that the current leaders have failed to responsibly manage budgets. thus leading us to make a new frame for how we see and experience them. Dirkx writes of learning through soul involving a focus on the interface where the socioemotional and the intellectual world meet.
or shame • A critical assessment of assumptions • Recognition that one's discontent and the process of transformation are shared • Exploration of options for new roles. knowing and managing our own emotions and motivating ourselves as well as recognizing emotions in others and handling relationships. or alternative beliefs. kinesthetic. 6) Art.. as are our beliefs and views may be. conventional wisdom. implied as subtext. inspiration. p 17 & 18). where inner and outer converge (Mezirow. We must become critically reflective of the assumptions of the person communicating. 11). transcendence. An example might be having a conviction that a tax levy in your community might be superfluous. Heron discusses a type of learning. Mezirow suggests transformations come about due to one of four ways: • Elaborating Existing Frames of Reference • Learning New Frames of Reference • Transforming Points of View • Transforming Habits of the Mind Transformations often follow some variation of the following phases of meaning becoming clarified: • A Disorienting Dilemma – loss of job. marriage. p. In Communicative learning we need to be mindful of assessing meaning behind the words. learning styles. then reflecting critically on the new information. Reflective Discourse In the context of Transformative Learning Theory. Domains of Learning Habermas identified two major domains of learning with different purposes. or moving to a new culture • Self-examination with feelings of fear. criteria of rationality. Meaning Structures – Or a Frame of Reference This is our structure of assumptions and expectations (including our cultural assumptions often received as repetitive affective experiences outside of our conscious awareness) through which we filter sense impressions. politicians may tell the electorate that any new taxes are unnecessary. and being open to looking at alternative points of view.
since we are questioning our own points of view. Kitchener – Three levels of cognitive processing: • compute. called Presentational – where we do not require words to make meaning (ex. music and dance are alternative languages. or realize that if someone expresses a contrary or different point of view. attitudes. 10). . logics of inquiry. This provides us with a more dependable way to make meaning within our lives. and modes of validating beliefs. more reliable and meaningful way of knowing that may be different from our old habits of the mind. social norms. 2000. or what Daniel Goleman calls emotional intelligence. Intuition. world view. etc. and think we know why they do or do not do what we expect. In this formulation. values and moral issues. Recent studies reveal that for effective discourse in transformative learning we need emotional maturity. 2000. in our political realities in our country. through assessing the evidence and arguments of a point of view or issue. and transcendence are central to self-knowledge and to drawing attention to the affective quality and poetry of human experience. usually through reconstructing dominant narratives or stories. divorce. not about seeking common ground. • Resulting Point of View – These include our points of view. anger. This often involves feelings. beliefs and judgments. we are not under personal attack especially. or the unconscious acquisition of knowledge. empathy. empathy. 2000. back to school. This requires us to become open to others' points of view. memorize. 22). 6). At this point. emerges in late adolescence. transformative learning pertains to epistemic cognition (Mezirow. we might find out that inflation and other economic factors (including loss of tax paying companies to our area) may have undermined our community's effort to continue to provide essential services. or how high are we willing to go with new taxes to continue our services?
Mezirow reminds us of our need to find collective experience and arrive at a best decision. feeling. relationships and actions • Planning a course of action • Acquiring knowledge and skills for implementing one's plans • Provisional trying of new roles • Building competence and self-confidence in new roles and relationships • A reintegration into one's life on the basis of conditions dictated by one's new perspective (Mezirow. • Instrumental learning – learning to control and manipulate the environment or other people (task oriented problem solving to improve performance) • Communicative learning – learning what others mean when they communicate with you. Assumptions include intent. This is about making personal understanding of issues or beliefs. much more sophisticated and rapid than conscious capacity (Mezirow. Art. form may change during adult years (Mezirow. aesthetic). or what are we willing to reduce in scope. truthfulness and qualifications of the speaker and authenticity of expressions of feeling. music. we make judgments about others. and dreams are other ways of making meaning. This flies in the face of what Deborah Tannen (1998) calls our argument culture and we find evidence of this daily. these include: moral consciousness. and perhaps these organizations are actually being fairly well managed. p. Transformations – A process whereby we move over time to reformulate our structures for making meaning. philosophies including religion. 2000. imagination.5). Often. p. our artistic tastes and personality type and preferences. If we actually seek data. To develop common ground we need to help others and ourselves move from self serving debate and move towards empathetic listening and informed constructive discourse. These processes are another approach from the purely rational and cognitive lens. p. 
If we are truly open to understanding we might engage them in dialogue and through discussion find out that our sense of them and their issues may be totally erroneous. 2000 p. 9).

repetition. or by transforming habits of mind. logical. or may pertain to other aspects of experience. 23). Critical to teachers helping effect transformative learning in adults. Relationship with other methods and approaches Historically. Transformative learning therefore involves the transformation of frames of reference (points of view. feelings. compute. scientific. values. by transforming points of view. in doing a study on women returning to school as adults. economic. Early theorists including Jean Piaget and Maria Montessori. we transform frames of reference -. However there are several reasons to consider transformative learning theory and practice for students (particularly adolescents) in schools and colleges. social. by becoming critically reflective of their assumptions and aware of their context… Assumptions on which habits of mind and related points of view are predicated may be epistemological. worldviews) and critical reflection on how we come to know. by learning new frames of reference. It is closely tied to behaviorism.” “. Jack et al. ecological. groups.” "Transformative learning refers to transforming a problematic frame of reference to make it more dependable . memorise. and habit-formation central elements of instruction. clear thinking decision makers. ideological. is the understanding of the importance of supportive relationships in the adult students' lives. has gained considerably in popularity. social or educational). monitoring progress and products of first order thinking Transformative Learning . organizations and nations. Emerges in late adolescence. Transformative Learning Theory provides a structure and process through which to better understand adult growth and development. Mezirow. and the criteria for knowing. Students who understand transformative learning may be better able to recognise the common stages of transformative change and have the tools to assist them during this process.
a theory that started with Mezirow and has been greatly enriched by many others.. cultural. It is also referred to as “communicative approach to the teaching of foreign languages” or simply the “communicative approach”. We are living through a period of transformational change in society and culture. an organization or workplace. In summary. Task-based language learning. CLT has been seen as a response to the audio-lingual method (ALM). The audio-lingual method The audio-lingual method (ALM) arose as a direct result of the need for foreign language proficiency in listening and speaking skills during and after World War II. Otherwise students may feel disempowered. The transition to adult life often involves personal transformation as students move from a safe school environment to take on complex work. and meanings rather than those we have uncritically assimilated from others -to gain greater control over our lives as socially responsible.our own and those of others -. And cognitive processing involves three levels: First Order Thinking . by generating opinions and interactions that are more justified. Transformative Learning What is Transformative Learning? According to Mezirow learning occurs in one of four ways: by elaborating existing frames of reference. Proponents of ALM felt that this emphasis on repetition needed a corollary emphasis on accuracy. a system (economic. claiming that continual repetition of errors would lead to the fixed acquisition of incorrect structures and non-standard pronunciation. They are: Objective Reframing – involving critical reflection on assumptions of others encountered in a narrative or task oriented problem solving Subjective Reframing – involving critical self-reflection of one's own assumptions about narrative (applying reflective insight from someone else's narrative to one's own experience. We become critically reflective of those beliefs that become problematic. the certainty of knowledge. p. 
a more recent refinement of CLT. From Mezirow: “Transformation theory's focus is on how we learn to negotiate and act on our own purposes. fear change. read and comprehend Metacognition . As we ask students to develop critical and reflective thinking skills and encourage them to care about the world around them, they may decide that some degree of personal or social transformation is required. or develop a degree of cynicism towards those who promote change. ethical. who may be experiencing transformative learning. Having a safe and supportive system of teachers and other significant people may greatly facilitate the student's willingness to move forward with transformative learning. and thus made drilling. discovered much of what we now know as Transformative Learning Theory. Communicative language teaching (CLT) is an approach to the teaching of second and foreign languages that emphasizes interaction as both the means and the ultimate goal of learning a language. and as an extension or development of the notional-functional syllabus. When students are led to a deeper understanding of concepts and issues, their fundamental beliefs and assumptions may be challenged, leading to a transformation of perspective or worldview. psychological.” Ref. . .. political. Transformative learning equips students with the concepts and understanding necessary to make a success of this transition. (2000) Learning as Transformation Do we need to be more explicit about Transformative Learning? Until recently Transformative Learning has largely been the province of adult learning theory. When we speak of reframing we are speaking of two different means of reframing. Students will be better able to understand and deal with such change if they understand the nature of transformation and the impact it has on individuals. habits of mind. or spiritual. reflecting on the limits of knowledge.. 2000. Students will need the tools of transformative learning in order to be effective change agents. 
become pessimistic about the future. study and social responsibilities. developed very thorough theories about childhood development and for years few scholars probed how adults learn and make meaning of their lives until Jack Mezirow. feelings and interpersonal relations (counseling or psychotherapy) and the way we learn (Mezirow.

lessons were often organized by grammatical structure and presented through short dialogues. Often. For example. The form of the connections and the units can vary from model to model. CLT is usually characterized as a broad approach to teaching. they almost always follow two basic principles regarding the mind: Any mental state can be described as an N-dimensional vector of numeric activation values over neural units in a network. it is most often defined as a list of general principles or features. These five features are claimed by practitioners of CLT to show that they are very interested in the needs and desires of their learners as well as the connection between the language as it is taught in their class and as it is used outside the classroom. In a notional-functional syllabus. as well as judicious use of grammar and pronunciation-focused activities. This observation may call for new thinking on and adaptation of the communicative approach. but that this in turn will lead to further communication. but in terms of “notions” and “functions. In the mid-1990s the Dogme 95 manifesto influenced language teaching through the Dogme language teaching movement. and eventually CLT as the most effective way to teach second and foreign languages. Spreading activation In most connectionist models. An enhancement of the learner’s own personal experiences as important contributing elements to classroom learning. not all courses that utilize the Communicative Language approach will restrict their activities solely to these. There are many forms of connectionism. Basic principles The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. cognitive psychology. Memory is created by modifying the strength of the connections between neural units. As an example. Some courses will have the students take occasional grammar quizzes. which is a numerical value intended to represent some aspect of the unit. 
At any time. Ordinary linguistic behaviour characteristically involves innovation. most notably pronunciation. formation of new sentences and patterns in accordance with rules of great abstractness and intricacy". networks change over time. though CLT has also been defended against this charge (e. or prepare at home using non-communicative drills. are generally represented as an (N×N)-dimensional matrix. the notion party would require numerous functions like introductions and greetings and discussing interests and hobbies. and advocated the notional functional syllabus. students listened repeatedly to recordings of conversations (for example. They looked for new ways to present and organize language instruction. Often. Harmer 2003[4]). if the units in the model are neurons. cognitive science. instruction is organized not in terms of grammatical structure as had often been done with the ALM. But. Much research using neural networks is done under the more general name "connectionist". Under this broad umbrella definition. As such.g. Though there is a large variety of neural network models. The introduction of authentic texts into the learning situation. the “notion” or context shopping requires numerous language functions including asking about prices or features of a product and bargaining. role-plays in which students practice and develop language functions.g. Noam Chomsky argued "Language is not a habit structure.” In this model. For example. the teacher will understand errors resulting from an influence from their first language. a unit in the network has an activation. An attempt to link classroom language learning with language activities outside the classroom. advocates of audio-lingual methods point to their success in improving aspects of language that are habit driven. 
The adapted communicative approach should be a simulation where the teacher pretends to understand only what any regular speaker of the target language would and reacts accordingly (Hattum 2006[5]). and it is very common in connectionist models used by cognitive psychologists. Spreading activation is always a feature of neural network models. One of the most recognized of these lists is David Nunan’s (1991) five features of CLT: An emphasis on learning to communicate through interaction in the target language. Moreover.[1] Classroom activities used in CLT Example Activities Role Play Interviews Information Gap Games Language Exchanges Surveys Pair Work Learning by teaching However. in the classroom CLT often takes the form of pair and group work requiring negotiation and cooperation between learners. but the most common forms use neural network models. the communicative approach is deemed a success if the teacher understands the student.[2] Henry Widdowson responded in defense of CLT. The provision of opportunities for learners to focus. The notional-functional syllabus Main article: Notional-functional syllabus A notional-functional syllabus is more a way of organizing a language learning curriculum than a method or an approach to teaching. A closely related and very common aspect of connectionist models is activation. Proponents of the notional-functional syllabus claimed that it addressed the deficiencies they found in the ALM by helping students develop their ability to effectively communicate in a variety of real-life contexts. if the teacher is from the same region as the student. the activation could represent the probability that the neuron would generate an action potential spike. a “notion” is a particular context in which people communicate. The connection strengths. not only on language but also on the Learning Management process. Native speakers of the target language may still have difficulty understanding them. 
that models mental or behavioral phenomena as the emergent processes of interconnected networks of simple units. If the model is a spreading activation model. This communication may lead to explanation. . Learning by teaching (LdL) Learning by teaching is a widespread method in Germany (Jean-Pol Martin). then over time a unit's activation spreads to all the other units connected to it. fluency-based activities that encourage learners to develop their confidence. Bax[3]) have critiqued CLT for paying insufficient attention to the context in which teaching and learning take place. or "weights". Connectionism is a set of approaches in the fields of artificial intelligence. also in the ELT Journal (1985 39(3):158-161). for instance. However. rather than as a teaching method with a clearly defined set of classroom practices. In the classroom. More recently other writers (e. Neural networks Neural networks are by far the most commonly used connectionist model today. neuroscience and philosophy of mind. Similarly. any teaching practice that helps students develop their communicative competence in an authentic context is deemed an acceptable and beneficial form of instruction. who proposed that published materials can stifle the communicative approach. As such the aim of the Dogme approach to language teaching is to focus on real conversations about real subjects so that communication is the engine of learning. Critiques of CLT One of the most famous attacks on communicative language teaching was offered by Michael Swan in the English Language Teaching Journal in 1985. audio-lingual methodology is still prevalent in many textbooks and teaching materials. and a “function” is a specific purpose for a speaker in a given context. The students take the teacher's role and teach their peers. units in the network could represent neurons and the connections could represent synapses. 
in the language lab) and focused on accurately mimicking the pronunciation and grammatical structures in these dialogs. Critics of ALM asserted that this over-emphasis on repetition and accuracy ultimately did not help students achieve communicative competence in the target language. Thus.
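The two basic principles stated above (a mental state as an N-dimensional activation vector, connection strengths as an N×N weight matrix) and the idea of spreading activation can be illustrated with a small sketch. The update rule used here, including the `decay` factor and the clamping to the range 0 to 1, is an illustrative assumption and is not taken from any particular connectionist model.

```python
def spread_step(activations, weights, decay=0.5):
    """One synchronous update of a toy spreading activation network.

    activations: list of N numbers (the activation vector).
    weights: N x N nested list; weights[i][j] is the strength of the
             connection from unit i to unit j.
    decay: fraction of its own activation each unit keeps (assumed value).
    """
    n = len(activations)
    new_activations = []
    for j in range(n):
        # Activation arriving at unit j from every unit connected to it.
        incoming = sum(weights[i][j] * activations[i] for i in range(n))
        value = decay * activations[j] + incoming
        # Clamp to [0, 1] so the value can be read as a probability-like level.
        new_activations.append(max(0.0, min(1.0, value)))
    return new_activations


# Three units connected in a chain: 0 -> 1 -> 2.
weights = [
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
]
state = [1.0, 0.0, 0.0]  # only unit 0 is active at the start
history = [state]
for _ in range(2):
    state = spread_step(state, weights)
    history.append(state)

for step, vector in enumerate(history):
    print(step, vector)
```

After two steps the activation that started in unit 0 has reached unit 2 through unit 1, which is the sense in which activation spreads over time to connected units.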

published in 1969. have argued that connectionist models will evolve towards fully continuous. These aspects are now the foundation for almost all connectionist models. A pattern of connectivity among units. that is. but PDP became popular in the 1980s with the release of the books Parallel Distributed Processing: Explorations in the Microstructure of Cognition. and the distributed nature of neural representations. try to model the biological aspects of natural neural systems very closely in so-called "neuromorphic networks". This has been criticized[1] as reductionist. An activation rule for combining inputs to a unit to determine its new activation. represented by sets of activation vectors for some subset of the units. Steven Pinker and others. represented by a function on the current activation and propagation. in a Boltzmann machine. Volume 1 (Foundations) and Volume 2 (Psychological and Biological Models). Rumelhart and the PDP Research Group. It wasn't until the 1980s that connectionism became a popular perspective among scientists. Earlier work PDP's direct roots were the perceptron theories of researchers such as Frank Rosenblatt from the 1950s and 1960s. Relational networks have been only used by linguists. Hebbian learning. and Karl Lashley. dynamic systems approaches. represented by a vector of functions on the activations. distributed systems. represented by a set of integers. It was an artificial neural network approach that stressed the parallel nature of neural processing. although the term "connectionism" is not used in the books. Herbert Spencer's Principles of Psychology. As early as 1869. for example in the 1940s and 1950s. It demonstrated the limits on the sorts of functions which single layered perceptrons can calculate. there was a reaction to it by some researchers. The framework involved eight major aspects: A set of processing units. Edward Thorndike was experimenting on learning that posited a connectionist type network. Thus. 
all cognitive processes can be explained in terms of neural firing and communication. The PDP books overcame this limitation by showing that multi-level. and it is now common to fully equate PDP and connectionism. non-linear. Computationalism is a specific form of cognitivism which argues that mental activity is computational. connectionists have created many sophisticated learning procedures for neural networks. There are also hybrid connectionist models. Psychological theories based on knowledge about the human brain were fashionable in the late 19th century. the activation is interpreted as the probability of generating an action potential spike. first made popular in the 1980s. They were influenced by the important work of Nicolas Rashevsky in the 1930s. Definition of activation: activation can be defined in a variety of ways. Many connectionist principles can be traced to early work in psychology. Some researchers argued that the trend in connectionism was a reversion towards associationism and the . Learning always involves modifying the connection weights. Biological realism The neural network branch of connectionism suggests that the study of mental activity is really the study of neural systems. David E. and were never unified with the PDP approach. Walter Pitts. Many earlier researchers advocated connectionist style models. But perceptron models were made very unpopular by the book Perceptrons by Marvin Minsky and Seymour Papert. Connectionists are in agreement that recurrent neural networks (networks wherein connections of the network can form a directed cycle) are a better model of the brain than feedforward neural networks (networks with no directed cycles). McCulloch and Pitts showed how neural systems could implement first-order logic: their classic paper "A Logical Calculus of Ideas Immanent in Nervous Activity" (1943) is important in this development. such as the connectionist Paul Smolensky. 
Many recurrent connectionist models also incorporate dynamical systems theory. These generally involve mathematical formulas to determine the change in weights when given sets of data consisting of activation vectors for some subset of the neural units. connectionists have many tools. These tended to be speculative theories. Connectionism vs. computationalism debate As connectionism became increasingly popular in the late 1980s. mostly mixing symbolic representations with neural network models. Backpropagation. Following from this lead. and models involve varying degrees of biological realism. A learning rule for modifying connections based on experience. as it was being developed. that the mind operates by performing purely formal operations on symbols. is probably the most commonly known connectionist gradient descent algorithm today. That is. high-dimensional. Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s. As a result. Friedrich Hayek proposed that spontaneous order in the brain arose out of decentralized networks of simple units. All gradient descent learning in connectionist models involves changing each weight by the partial derivative of the error surface with respect to the weight. Learning algorithm: different networks modify their connections differently. Donald Olding Hebb. Generally. In the 1950s. any mathematically defined change in connection weights over time is referred to as the "learning algorithm". An activation for each unit. McClelland. that is still used today. 3rd edition (1872). This links connectionism to neuroscience. An environment which provides the system with experience. Parallel distributed processing The prevailing connectionist approach today was originally known as parallel distributed processing (PDP). Many researchers. including Jerry Fodor. For example. 
and Sigmund Freud's Project for a Scientific Psychology (composed 1895) propounded connectionist or proto-connectionist theories. represented by a change in the weights based on any number of variables. showing that even simple functions like the exclusive disjunction could not be handled properly. By formalizing learning in such a way. A perceived limitation of PDP is that it is reductionistic. and is determined via a logistic function on the sum of the inputs to a unit. by James L. Connectionism apart from PDP Though PDP is the dominant form of connectionism. such as that of William James. and proposed a learning principle. represented by a matrix of real numbers indicating connection strength. computational neuroscientists. which were little more than speculation until the mid-to-late 20th century. History Connectionism can be traced to ideas more than a century old. Hebb contributed greatly to speculations about neural functioning. They argued that connectionism. they are now used by very few researchers. Many authors find the clear link between neural activity and cognition to be an appealing aspect of connectionism. represented by a vector of time-dependent functions. represented by a function on the output of the units. Warren McCulloch. but some neural network researchers. A lot of the research that led to the development of PDP was done in the 1970s. The hybrid approach has been advocated by some researchers (such as Ron Sun). like a Turing machine. Learning Connectionists[citation needed] generally stress the importance of learning in their models. other theoretical work should also be classified as connectionist. Most of the variety among neural network models comes from: Interpretation of units: units can be interpreted as neurons or groups of neurons. Hayek's work was rarely cited in the PDP literature until recently. the neurologist John Hughlings Jackson argued for multi-level. It provided a general mathematical framework for researchers to operate in. 
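The text mentions an activation determined by a logistic function on the summed inputs to a unit, and gradient descent learning in which each weight changes according to the partial derivative of the error with respect to that weight. A minimal sketch of both ideas for a single unit follows. The squared-error measure, the learning rate, the epoch count and the choice of the logical OR task are all assumptions made for illustration; in practice the weight change is the partial derivative scaled by a small learning rate and applied with a negative sign.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def unit_output(weights, bias, inputs):
    """Activation of one unit: logistic function of the summed weighted inputs."""
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_unit(samples, epochs=2000, rate=0.5):
    """Batch gradient descent on squared error for a single logistic unit."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for inputs, target in samples:
            out = unit_output(weights, bias, inputs)
            # Derivative of E = 0.5 * (out - target)^2 with respect to the
            # summed input, using d(logistic)/dx = out * (1 - out).
            delta = (out - target) * out * (1.0 - out)
            for k, x in enumerate(inputs):
                grad_w[k] += delta * x  # partial derivative dE/dw_k
            grad_b += delta             # partial derivative dE/dbias
        # Move each weight against its partial derivative.
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b
    return weights, bias


# Logical OR: a task a single unit can learn because it is linearly separable.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_unit(OR_SAMPLES)
predictions = [round(unit_output(weights, bias, inputs)) for inputs, _ in OR_SAMPLES]
print(predictions)
```

Backpropagation applies the same idea to networks with several layers by passing the error derivatives backwards through the connections; this sketch covers only the single-unit case.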
was in danger of obliterating what they saw as the progress being made in the fields of cognitive science and psychology by the classical approach of computationalism. Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments. An output function for each unit. The books are now considered seminal connectionist works. Connectionist work in general need not be biologically realistic. A propagation rule spreading the activations via the connections. non-linear neural networks were far more robust and could be used for a vast array of functions. A very common strategy in connectionist learning methods is to incorporate gradient descent over an error surface in a space defined by the weight matrix. But by the early 20th century.
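The limitation attributed above to single-layered perceptrons (that the exclusive disjunction cannot be handled) and the claim that multi-level, non-linear networks overcome it can be checked on a small scale. The sketch below is an illustration only: it uses simple threshold units, searches a coarse grid of weights rather than proving the general result, and hand-wires the two-layer network instead of learning it. The grid range and the particular weights are assumptions chosen for the example.

```python
from itertools import product

# Truth table for exclusive disjunction (XOR).
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(x):
    """Threshold activation: the unit fires (1) only for positive input."""
    return 1 if x > 0 else 0


def single_layer(w1, w2, b, x1, x2):
    """One threshold unit connected directly to the two inputs."""
    return step(w1 * x1 + w2 * x2 + b)


def two_layer(x1, x2):
    """Hand-wired network with one hidden layer (weights are assumptions)."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is on
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are on
    return step(h_or - h_and - 0.5)  # "or, but not and"


# Try every weight/bias combination on a grid from -4.0 to 4.0 in steps of 0.5.
grid = [v / 2 for v in range(-8, 9)]
single_layer_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if all(single_layer(w1, w2, b, *x) == y for x, y in XOR.items())
]
two_layer_correct = all(two_layer(*x) == y for x, y in XOR.items())

print(len(single_layer_solutions), two_layer_correct)
```

No single-unit setting on the grid reproduces the XOR table, while the network with a hidden layer does, which matches the contrast the text draws between single-layered perceptrons and multi-level networks.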

Therefore. These sub-networks will be connected to areas of the brain that control the phonological. has led to the successful modelling of a great many of these early problems. share many of the same nodes as other words.g. would be connected to hundreds or thousands of other nodes making up a mini-network for that word. such as psychology or philosophy of mind. The recent popularity of dynamical systems in philosophy of mind have added a new perspective on the debate. This was later achieved[citation needed]. Within a network the nodes are organized into 'levels' such that any one node excites or inhibits other nodes at its own or different levels. thus the debate persisted. trying to ensure that their models resemble neurological structures. and the debate about fundamental cognition has thus largely been decided amongst neuroscientists in favour of connectionism. whereas connectionist models are generally more opaque. Computationalists often posit domain specific symbolic sub-systems designed to support learning in specific areas of cognition (e. Connectionism and computationalism need not be at odds. and thus may be seen as contributing to our understanding of particular mental processes. habits and rules are not stored in these interconnections. Patterns. In this sense the debate might be considered as to some extent reflecting a mere difference in the level of analysis in which particular theories are framed. The recently proposed Hierarchical temporal memory model may help resolving this dispute. for example to the phonological. but what is stored are the interconnection strengths that allow these . However. to the extent that they may only be describable in very general terms (such as specifying the learning algorithm. intentionality. The debate largely centred on logical arguments about whether connectionist networks were capable of producing the syntactic structure observed in this sort of reasoning. 
does not contain many of the more advanced and sophisticated notions of connectionism (see Bechtel and Abrahamsen (1991) or Cohen et al. (1993) for reviews in this area). One could visualize that the representation of a word might involve interconnections between various parts of the network. language. so this is not a potential vindication of computationalism. But despite these differences. but in the interconnections between the nodes in the form of a network. etc. whereas connectionists engage in "low level" modeling. a broad theory of cognition (i. of course. Associationism dates from classical times but was substantially refined by the seventeenth century philosophers Hobbes and Locke. Connectionism as a paradigm of learning has its roots in associationism. This distribution information provides us with several advantages which will be discussed later. Each of these nodes can be connected to many different networks. In this sense connectionist models may instantiate. This is logically possible. Nonetheless. something they felt was mistaken. and thereby provide evidence for. while connectionists posit one or a small set of very general learning mechanisms. Part of the appeal of computational descriptions is that they are relatively easy to interpret. although using processes unlikely to be possible in the brain[citation needed]. There is no unified agreement on what exactly connectionism is. Today.e.another mini-network. a subnetwork of morphological knowledge can connect with a sub network of word roots. connectionism). whereas connectionists focus on learning from environmental stimuli and storing this information in a form of connections between neurons. hence the relationship to associationism. we do have some insights from our knowledge of the mental lexicon what it might look like. Future research may be able to clarify this knowledge. 
a well known word will have a very intricate network of interconnections and less well known words will have fewer interconnections. computational descriptions may be helpful high-level descriptions of cognition of logic. The knowledge is stored in these interconnections and is associated with other kinds of knowledge contained in the network and to other networks. These nodes are massively interconnected with other nodes to form a network of interconnections. Neural networks seek to explain cognition in biological or neurological terms and PDP tries to show that the information is not stored in the brain in one place but is distributed throughout the various parts of the brain which serve certain linguistic and non-linguistic functions. as it is well known that connectionist models can implement symbol manipulation systems of the kind used in computationalist models[citation needed]. Part 1: What is connectionism? Connectionism as a term was first mentioned in Thorndike's study (1898) of the way cats learn in incremental stages. Connectionists believe that these interconnections store the lexical information. Throughout the debate some researchers have argued that connectionism and computationalism are fully compatible. at least to some degree. Connectionism borrows heavily from associationism and is a term that covers neural networks and Parallel Distributed Processing (PDP). speech. but instead use the terms nodes and networks which are said to represent a crude but effective approximation of the neural state of the brain at a superficial level . however this does not mean that the information is stored in one place (one cannot look inside the brain and find a particular word for example). From the interaction of these inter-related networks we can form the meaning of a word and find the correct word to choose. these fairly recent developments have yet to reach consensus acceptance amongst those working in other fields. 
auditory functions as well as the storing of lexical-specific information. it was those very tendencies that made connectionism attractive for other researchers. without representing a helpful theory of the particular process which is being modelled. Connectionism is also based on this principle but is somewhat different in that it encompasses much more as outlined below. Computationalists believe that internal mental activity consists of manipulation of explicit symbols. The differences between the two approaches that are usually cited are the following: Computationalists posit symbolic models that do not resemble underlying brain structure at all. but the debate in the late 1980s and early 1990s led to opposition between the two approaches. Connectionist architectures of cognition are loosely based on the architecture of the brain. given that it explains how the neocortex extracts high-level (symbolic) information from low-level sensory input. Computationalists generally focus on the structure of explicit symbols (mental models) and syntactical rules for their internal manipulation. or in unhelpfully low-level terms. The sum of all these interconnections for that word make up the knowledge about that word which the learner has. though full consensus on this issue has not been reached. the number of units.). A different word would have a different set of nodes connected to hold that information . as indeed they must be able if they are to explain the human ability to perform symbol manipulation tasks. or may not depending on the makeup of that word. Associationism by contrast. Each sub-network making up a 'word' as such. which in turn can connect to a semantic sub-network which stores meanings of words. These sub-networks store information that can be accessed by other sub-networks. While the exact makeup of these interconnections is not known. 
some authors now argue that any split between connectionism and computationalism is more conclusively characterised as a split between computationalism and dynamical systems. But the debate rests on whether this symbol manipulation forms the foundation of cognition in general. It may. Connectionists do not use neurological terms such as synapses and neurons directly. If we have to find a past tense form for example. hence the term connectionism. number). Generally PDP and connectionism are seen as being synonymous. semantic or orthographic parts of the network. some theorists have proposed that the connectionist architecture is simply the manner in which the symbol manipulation system happens to be implemented in the organic brain. The fundamental belief of associationism is that learning could be regarded as the formation of associations between previously unrelated information based on their contiguity. whereas connectionists believe that the manipulation of explicit symbols is a poor model of mental activity. abandonment of the idea of a language of thought. In contrast. and general advances in the understanding of neural networks. however most connectionist models seem to share several properties. progress in neurophysiology. For example. In essence then our lexicon (or lexicons) is made up of hundreds or thousands of these sub-networks all massively interconnected to form the lexicon. From this we can see that the knowledge is distributed among many interconnections. the morphological network can be tapped to retrieve it. for example. Some connectionists believe that information is related to each other in the brain in the form of massively interconnected sub-networks rather than as a simple unified system.

'words to use when apologizing in German' and indeed many facets of vocabulary acquisition all linked together. At the other extreme. Advantages of a distributed network. Sometimes learners cannot comprehend a lexical item due to insufficient conceptual development or lack of background knowledge. is that the representation is incomplete and is only a partial representation of a learner's knowledge of see.reflecting her own view of the word see. for diagrammatic purposes only. . This model can show the interconnections (or lack thereof) to non linguistic knowledge that can hamper comprehension. these nodes could not exist at all. but nevertheless is systematic and which the learner is constantly updating (or has fossilized) (see Klein. Secondly. but in some completely different way . This means that if a learner is asked for a word that means 'a round. some of which are thin and some are thick. She is less sure about her knowledge (partial knowledge) that the pronunciation of the past tense of saw is /sØ:/ represented by the thinner line. therefore. partial and incorrect storage of lexical knowledge. 1986). both from the l1 and the L2 could also be accommodated here. the information is stored in a connectionist architecture can be accessed in many ways. This model can also demonstrate how we could instantiate knowledge from the network in a schematic way. Clearly the richer the network of associations. Another advantage of a distributed system is that if one part of the system deteriorates (for example a given word is known but temporarily cannot be recalled) the whole system does not break down as the forgotten word will be connected to other words which could replace it. The learner is relatively sure that see means something like 'an image comes to my eyes' and that it collocates with some objects. of course. if a learner had learned collapse but when called upon to produce it cannot access it. 
This model reflects this well as it can account for information that is not part of the L1 nor the L2. Alternatively. some of the models developed have been able to model at least some specific aspects of human performance (see Cohen et al. It would take only a little imagination to conceive of a diagram which could represent knowledge of 'affix knowledge'. 1981 for an example). The learning of an L2 lexicon would involve deepening and enriching these networks and their interconnections. Clearly the L1 can intrude on the transfer of L1 lexical knowledge. therefore each node can connect to many networks. This is often called graceful degradation. This in turn allows the parallel processing of information where the brain can process many things at once. or asked for the meaning of 'soccer ball' he can answer from both directions. if a learner comes across a new word he may be able to guess from context prior lexical knowledge. Each learner will have a different network of associations and interconnections. For example. Clearly. the thinner the line represents less 'well-known' information. 'preposition use' and 'objects that collocate' with their sub-categories. See Bechtel and Abrahamsen (1991) Broeder and Plunkett (1994) Ney and Pearson (1990) for more detail in these areas. Such a highly interconnected network would. and Haberlandt. The first and most obvious. 1994 for numerous examples). Some SLVA researchers have proposed different lexicons serving different purposes such as productive or receptive L2 lexicons. Learning. That said. The stronger the interconnection (thicker line) the more 'well-known' the information is. Incremental learning Interlanguage phenomena point to a learner system whereby learning is incremental. This model can account for learner variation even with learners from the same L1 and with the same input having differing lexicons. 
such as 'meaning' are shown as being connected to other parts of the network by the lines leaving the diagram. The knowledge is stored in the interconnections. 'semantic networks. the more chance there will be of comprehension. 1993. nodes that have been labeled 'meaning' 'past tense ending'. Human-like behaviour. This makes these models very attractive to psychologists in particular. Schema theory has shown us the importance of background knowledge and the relationship it has to comprehension (see Brewer and Treyens. a substitute could be found from within the network such as fall down. this learner may explicitly know that the past tense ending of see is not seed (is not /i:d/). This would be represented by drawing a line to that part of the diagram. Thirdly. Individual variation. Therefore. spontaneous generalization. What can the model demonstrate? Associative learning. For example. for example there are no nodes for its 'idiomatic use'. By the same token. Knowledge is seen at the micro-structure level rather than the macro-structure level of cognition. Partial knowledge. The model can account for full. Generation from experience. One of the main achievements of a connectionist system is that it can process information and learn in ways which mirror some aspects of human learning and information processing such as. The diagram does not show all the other possible nodes about see. the strength of the interconnections reflects the relative knowledge one has about an item of vocabulary. Knowledge that things are not something can also be accounted for in this model. Therefore. In this model there is no reason to assume that sub-networks for separate lexicons could not exist side by side or be interconnected. 'words I have problems with'. Content addressability. and done in successive and/or recursive steps. reflecting no knowledge between these nodes and thus no knowledge of see. 
The reader should immediately notice several things about this network and the limitations of representing the network in diagrammatic form. or there could be no interconnections between them. Clearly we receive many different forms of input at any given time all of which must be processed simultaneously. the network is immensely complex in structure. Each network of knowledge is connected to many other networks. the knowledge that see is pronounced the same as sea and so on. The learner could assign labels to these quite differently and in fact not even have them categorized as shown. patterns and rules to be recreated. each of the nodes and sub-categories. In addition. Therefore. pattern matching. hollow leather thing that you can play soccer with'. diagram 1 shows. The network thus has built-in redundancy in that the capacity to continue correct operation despite the loss of part of the information comes from the fact that the original network had encoded more information than was necessary to maintain the network. One piece of lexical information connects to another and can instantiate a related idea or word (see Rumelhart and Ortony, 1977 for a discussion of schema). Prototypical representations of the lexical environment emerge as a natural outcome of the learning process. is a by-product of processing. those who say there is only one lexicon for all lexical knowledge. the associative nature of vocabulary is shown here. stimulus categorization and concept learning. A major advantage of this is economy of the network in the sense that a single node could be connected to many others, thus allowing one node to form many representations. the concept of see in diagram 1 is distributed among many interconnections. be beyond diagrammatic representation. Word knowledge in this network is content addressable.
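
Content addressability and graceful degradation, as described above, can be illustrated with a toy sketch. The Python below is only an illustration under stated assumptions: the class name LexicalNetwork, the link strengths and the cue labels (including 'structure gives way') are hypothetical and are not taken from the model discussed here. A real PDP system would hold this knowledge in distributed patterns of activation rather than in labelled links.

```python
from collections import defaultdict


class LexicalNetwork:
    """Toy associative network (hypothetical): nodes joined by weighted links."""

    def __init__(self):
        self.links = defaultdict(dict)

    def connect(self, a, b, strength):
        # Links are stored in both directions so either end can act as a cue.
        self.links[a][b] = strength
        self.links[b][a] = strength

    def associates(self, node):
        # Neighbours of a node, strongest link first.
        neighbours = self.links.get(node, {})
        return sorted(neighbours, key=neighbours.get, reverse=True)

    def recall(self, cues, unavailable=()):
        # Sum the link strengths from every cue and return the best-supported
        # node, skipping the cues themselves and any temporarily blocked node.
        scores = defaultdict(float)
        for cue in cues:
            for node, strength in self.links.get(cue, {}).items():
                if node not in cues and node not in unavailable:
                    scores[node] += strength
        return max(scores, key=scores.get) if scores else None


net = LexicalNetwork()
# Assumed strengths: a thicker line in the diagram maps to a larger number.
net.connect("soccer ball", "round", 0.9)
net.connect("soccer ball", "play soccer", 0.8)
net.connect("soccer ball", "leather", 0.7)
net.connect("soccer ball", "hollow", 0.6)
net.connect("orange", "round", 0.8)
net.connect("collapse", "structure gives way", 0.9)
net.connect("fall down", "structure gives way", 0.6)

# Content addressability: description -> word, and word -> description.
word = net.recall(["round", "hollow", "leather", "play soccer"])
features = net.associates("soccer ball")

# Graceful degradation: 'collapse' is blocked, so a weaker neighbour is used.
substitute = net.recall(["structure gives way"], unavailable={"collapse"})
```

In this sketch the description retrieves 'soccer ball', the word retrieves its features in order of strength, and blocking 'collapse' yields 'fall down' instead of a failure.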

This matches the view that a new word will not be learned completely on first meeting. its collocates. but it cannot discriminate it from the relationship in semantic terms with 'The florist loves Joan'. but we could fall back on the idea of levels (to be outlined below). One thing missing in many discussions of connectionism is the conscious working mind. These are often referred to as higher cognitive functions. If the higher cognitive functions exist in PDP terms. register and so on) will incrementally grow with the number of times the word is met in various contexts. we constantly match new input to old information and adjust our knowledge store network according to the new information. As new information is added. Lack of lexical knowledge can be represented. we would need to be able to explain why there are parts of a PDP system which are transparent and why other parts are not. then when the learner meets an unknown word ending in '-ist' he can guess that it would be a person doing a certain type of work. Their example says that a PDP system can connect Joan. For example. Evidence from second language data. Alternatively. intention and higher cognitive functions. This lack of a developed network could help to account for why it is that learners are at a loss for words at times. It would therefore follow that lexical items which are not repeated or met frequently could have tenuous interconnections. he could create a novel word such as 'computerist'. PDP systems do in fact go through what seem to be developmental stages, as do first language learners. incrementally and through exposure to input. Therefore there needs to be constant practice and reinforcement. All languages have exceptions such as go / went and suru (do) and kuru (come) in Japanese. for example confirming that we in fact say 'do the washing' and weakening interconnections of other parts of the mini-network making up 'do'. he might generate 'pianist' (if it was not known) from knowledge about piano. 
This would not mean that the collocation would be 'learned' but that a link had been made, and probably the learner will continue to use 'do' in preference to 'commit' until the network has been so altered through repeated exposure, practice and use to reflect the preference for 'commit' over 'do'. A network could add the representation but it could not disambiguate the two sentences. Purist connectionism views these as the by-products of the processing of information. It may be that we should not be trying to explain all things at all levels. This is despite Chomsky (1986) stating recently that the generatability of syntax is no longer the goal of generative linguistics. we could see that some interconnections were strong, reflecting perceived well-known information (even if it is wrong), and others were weak, reflecting less well-known information. and -ist. For example. Connectionist systems have the ability to automatically or spontaneously generalize from experience. Learning under this model. they say that a PDP system is inadequate to the task of representing syntactic knowledge. If a network representing say the word/concept 'do' were caught at rest. 217). loves and florist in 'Joan loves the florist' to give it meaning. give an elegant account of other phenomena as well' (Bechtel and Abrahamsen. The challenge for these researchers then is to develop a system to 'account for the phenomena which are handled rather well by rules but also. Developmental sequences. Fodor and Pylyshyn state that the model is not good at learning in developmental stages which a rule-based approach can capture. Imagine for a moment we could take a snapshot of the network at rest. Connectionist models are good at the lower level of cognition such as content addressability. work. overgeneralization of lexical applications can be explained in these terms. if a learner knows the affix '-ist' can refer to a person doing a particular kind of job or work. Therefore. Universal application. 
then the learner could connect 'crime' with 'commit' rather than with 'do', making a new interconnection. 1991 p. 1990), Shirai's work on L1 transfer (1992) and Marchman (1992) and Sokolik and Smith's work on critical periods (1992). low-level perception and spontaneous generalization. The strength of these interconnections is altered by the input strengthening some interconnections. if a learner wanted to generate a word, '-ist' could be added to a person's job to create that word. memory. Successive steps in the learning process alter the associative interconnections by the strengthening or weakening of the interconnections. simply the network has not been set up or it contains insufficient knowledge. This ignores the fact that some aspects of language tend to be rule-governed and some aspects do not. This lack of work does not mean a lack of interest, however, and is understandable in that the field is only 10 years old. whereas more traditional views of cognition (the current dominant experimental paradigm) see the mind as being somehow broken into parts. although many try to resist external explanations. For example. the stronger the interconnection that makes up that part of the word's knowledge. It will be a rare occasion that a new word is learned at one trial with all its features readily available for use. These quite obviously exist in some form or another as we can all say we have them. new interconnections are made to different nodes to account for this. This could be done both productively and receptively. Blackwell and Broeder's work on frequency (1992), Gasser's work on word order. without additional mechanisms. Beginning vocabulary learners often will not have an L2 network set up for some words/concepts, let alone one that can find substitutes when needed. but the knowledge of that word (such as the pronunciation. Fodor and Pylyshyn (1988) argue that a connectionist system cannot capture the representation of syntax well. 
However, there has been little success in discovering such examples at higher levels of cognition. (1988. An extensive search found no specific studies of second language vocabulary acquisition from a connectionist perspective. despite the fact that present connectionist simulations cannot do so. Capturing syntactic structure. As we learn. The better known a piece of word knowledge is. Broeder and Plunkett's study of developmental order for pronouns in L2s (1994). Most of the research has been done in the first language and it has only been very recently that work has started on a second language. This may be due to the very complex and multi-faceted nature of SLVA and the fact that researchers may be more interested in the bigger picture of SLA rather than SLVA in particular. intention and so on. This does not mean that a word cannot be learned at one trial, however. Very little work on connectionism has been done in second languages. This modular view says we have different forms of memory and storage and that these can be tested in certain ways to find out how our lexicons work. if a learner said 'do a crime' and was corrected. Notable exceptions are Schmidt's review (1990). Similarly. the things that we call consciousness. Furthermore. 'grammatical' features of the word. at least given the current generation of PDP models. What can the model not demonstrate? The working mind. It is clear that there are levels of human processing for which PDP models may not be an appropriate level of analysis. It should be noted that this practice is not behaviourist in the sense that each item will need repeating many times and that is the only way to learn. though connectionist networks do not prevent this happening. Connectionism rests on the assumption that we learn by trial and error in successive steps. spelling. It is not generally accepted even by PDP researchers that a connectionist (=PDP) model can account for all areas of human cognition. 
What it does mean is that the interconnections may need reinforcing to strengthen them. Our processing of the input affects our future potential output in that the present knowledge store has been altered by new input and a new status quo is made until new input comes along to confirm the present state or lead us to review it again. The network or sub-networks making up the lexicon are ever-changing and one could view them as never resting.
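
The 'do a crime' / 'commit a crime' example can be made concrete with a small sketch of how repeated input might strengthen one interconnection while weakening its competitor. Everything numeric here is an assumption made for illustration: the starting strengths, the learning rate and the simple error-correcting update in reinforce are hypothetical and are not the learning rule of any particular PDP model cited in the text.

```python
def reinforce(weights, cue, heard, rate=0.2):
    """Move the cue->heard link toward 1 and competing links from cue toward 0."""
    for candidate, strength in weights[cue].items():
        target = 1.0 if candidate == heard else 0.0
        weights[cue][candidate] = strength + rate * (target - strength)


def preferred(weights, cue):
    """Return the collocate with the strongest link to the cue."""
    return max(weights[cue], key=weights[cue].get)


def simulate(exposures, rate=0.2):
    # Assumed starting state: 'do' is strongly linked to 'crime', 'commit' barely.
    weights = {"crime": {"do": 0.8, "commit": 0.1}}
    for _ in range(exposures):
        # Each exposure (or correction) presents 'commit a crime' as input.
        reinforce(weights, "crime", "commit", rate)
    return preferred(weights, "crime")
```

With these assumed numbers, simulate(1) still returns 'do', because one correction creates a link without overturning the older preference, and the preference only switches to 'commit' at simulate(3). This mirrors the point that a single correction makes a link but does not by itself mean the collocation has been learned.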

However. 1992. It seems therefore that the issue of whether the current symbolic paradigm or connectionism is the one and only explanation misses the point. A network system. a symbol for that knowledge. These terms do not exist in purist (PDP) systems. Recent debates by Fodor and Pylyshyn (1988) and the Jacobs and Schumann (1992) vs. Eubank and Gregg debate (1995) and reviews by Bechtel and Abrahamsen (1991) and Morris (1989) underscore these differences. Both sides tend to see things in extreme terms. al. objects and so on. Elman (1991. how we can guess the meaning of words and so on. p. These computer models cannot model human behaviour exactly and indeed sometimes generate very non-human responses. 1993. In the final stage the model strikes a balance between the two poles of regularity and irregularity and even overgeneralizes at times as children would do (e.g. Connectionist systems of vocabulary acquisition have many characteristics that are desirable in simulations of human cognition. Differences from symbolic processing. and how it can substitute for lack of knowledge. Neither side has produced evidence for this universality and clearly both have their limitations (see Cohen et al. Clearly much has been learned about the workings of memory in relation to vocabulary learning in a second language in cognitive terms (see Nation. Symbolic systems are context-insensitive in that they are distinct from their environment. However. verbs. Connectionist models on the other hand begin the task at the other end of the continuum. and the second phase concentrating on rule-governed regularities. Symbolic systems. 1992. or semantic groups such as 'words for travel' and so on, each having a label for the kind of knowledge stored. how we may store and retrieve lexical knowledge. There are parts of our cognitive apparatus which are open to inspection and are transparent in nature and empirically testable. 
PDP systems will accept that rules can be stored in a connectionist network. This view is the one currently coming into fashion (see Kempen. by comparison. and Pinker. they cannot be tested empirically at the micro level of cognition and we are left with computer simulations of learning. The transparent part of our cognitive system may operate at a higher level and would include what we know about memory and so on. Marcus et al. That is. (1984) for a review in this area. In symbolic systems word knowledge is couched in terms of parts of speech such as nouns. therefore are subject to the fallacy that things can only be referred to in symbolic terms and therefore do not connect themselves to the real world. a universal take-it-all-or-leave-it-all view (Pinker and Prince 1988). in a sense it is unavailable to us and the interconnections are made automatically without our intervention. 1991). can deal with anomalies by adding further assumptions. such as memory span. lexical competence. (1993) for a review). This is not clear for second language learners, however. how lexical knowledge is schematic or associative. if one views the connectionist/symbolic argument in terms of a non-universal answer then the situation changes somewhat and one can see things in terms of a complementary rather than confrontational stance. but it is unusual to find so many in one model. may arise out of the general properties of parallel distribution which operate without any reference to such rules. attention and so on. Symbolic systems such as the generative linguistic paradigm would account for linguistic knowledge in terms of nouns. These debates take place on the basis that accepting one view means the other is unacceptable. These models show the learning process over time. A connectionist account of lexical knowledge is good at describing the storehouse of vocabulary. 
an alien listening to us via radio signal might learn the sounds of the language but not the semantics unless they could observe a word's relationship with objects and the events to which it refers. That is. There are other parts of our cognitive system which are not open to inspection. It seems that the connectionist architecture could operate at a lower 'impenetrable' level of cognitive activity whereby we are not able to access it by introspection. purely syntactic). but they require additional apparatus to account for regularities which reflect the interaction of meaning with form and which are more contextually defined. how the words are connected through their associations. such as how we retrieve lexical information from our brain or how we process the auditory information and add it to our store. this is important as most studies in SLVA have been cross-sectional in nature. it may be that aspects of human performance which appear so regular as to be conveniently summarized by rules (like the rules of grammar in a language). They emphasize the importance of context in the interaction of form with meaning'. This means that under a PDP paradigm. It is important to distinguish connectionism from a symbolic account of learning and knowledge storage. the symbolic system loses its causal role in cognition, which is an unacceptable outcome to many linguists, as a typical UG proponent would see these rules as essential to human linguistic processing. Many of these are found in other models of cognition. subjects. Summary. In addition, the computer simulations take a long time to learn whereas humans can learn at one trial, and new simulations need to be developed to account for these inadequacies. See Johnson-Laird et. This would lead to a two-level interdependent model of vocabulary acquisition. symbolic systems have rules by which this information can be processed and rules which state what is impossible in a language. 
1990 for a review) but they offer little in the way of insights into the micro-view of cognition which connectionism seems to explain quite well. It would make sense to have a two-level hybrid system because the symbolic machine operates according to its own autonomous set of principles. feets instead of feet). but they are not the foundation stone on which the network is made. 221) says 'this insensitivity allows for the expression of generalizations which are fully regular at the highest level of representation (e.g. In the first phase the systems tend to catch the irregularities by rote. for example graceful degradation, automatic generalization and so on. Typically. Non-human behaviour. Due to the very nature of these systems not being transparent.
