
APLING 629: Structure of the English Language

Data Driven Learning
Prof. Charles Meyer

Apostolos Koutropoulos
5/12/2009
Data Driven Learning Overview
Data Driven Learning derives from the work of Tim Johns when he suggested

that instructors should use corpora in language learning classrooms. (Braun, 2007)

Data driven learning is the "application of tools (concordancers) and techniques

from corpus linguistics in the service of language learning." (Payne, 2008) The

benefit of data driven learning is that the focus is on "the exploitation of authentic

materials even when dealing with tasks such as the acquisition of grammatical

structures and lexical items [...], on real, exploratory tasks and activities rather than

traditional «drill & kill» exercises, [...] on learner-centred activities," and on "the use

and exploitation of tools rather than ready-made or off-the-shelf learnware."

(Rüschoff, n.d.)

Data driven learning differs from traditional grammar learning in a few ways.

For starters, the pedagogical approach to teaching grammar the traditional way is

through a process of presentation of information on the teacher's part, then the

students practice with this information, and then the students produce new content.

In contrast, in Data driven learning students observe a grammatical phenomenon of

the language, they hypothesize as to how this grammatical phenomenon works, and

then they experiment to see if their hypothesis is correct (Payne, 2008).

In traditional language and grammar learning, the teacher is the driver and

the students the passengers, while in data driven learning the teacher is more of a

co-pilot and navigator and the students are able to sit in the driver's seat and take

control of their learning. Because of this difference in the pedagogic approach, the

materials used are also different in a data driven learning classroom compared to

traditional classrooms.

In a traditional classroom the main companions to instruction are textbooks.

The textbooks and the traditional approach to grammar learning "divides up

grammar in an [sic] system that ignores the nature of English and of authentic

communication using English." (Byrd, 1997) This of course poses a few issues that

are outlined by Byrd, such as the inconsistency in defining what is easy grammar

versus hard grammar, the ability to cover all the material in a given curriculum, and

using authentic materials in the language classroom (1997).

Data driven learning solves some of these issues by not relying on textbooks,

but rather relying on corpora, "a body of text assembled according to explicit

design criteria for a specific purpose," (Payne, 2008) concordancing programs and

keyword-in-context (KWIC) displays. By using corpora, you are using authentic text from the

target language both in your instruction of that language and the grammar of that

language. Thus you are exposing students to material that they are likely to come

across as users of the language.
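
To make the idea of a concordance concrete, the following is a minimal sketch in Python of how a KWIC display can be produced. It is not the tool used by any of the authors cited here; the function name kwic, the width parameter, and the sample sentences are my own illustrative assumptions, and the sentences were invented rather than taken from a real corpus.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of keyword in text."""
    # Collapse line breaks and repeated spaces so the context reads as one line.
    text = " ".join(text.split())
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}} {match.group(0)} {right}")
    return lines


# Invented sample text standing in for a real corpus (assumption for illustration).
sample = (
    "It was a hectic week at the office. "
    "After a hectic morning, we finally sat down for lunch. "
    "The holidays are always hectic for shop owners."
)

for line in kwic(sample, "hectic"):
    print(line)
```

With a real corpus, the sample string would be replaced by text loaded from a file, and the teacher could decide how many of the resulting lines to give to the students.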

Resource Evaluation
Running a quick Google search, one does not find many resources pertaining

to language teaching using a data driven learning methodology. Looking a little

deeper, I found a number of journals, such as ReCALL, where I was able to find more

information about data driven learning in practice. What I found interesting was that

data driven learning was the focus of experiments in classrooms in Europe, Asia and

the Middle East; however, I did not find any articles where the focus of the

experiment was an ESL classroom in the United States, where corpora are available

and primary materials such as video and audio files are also easily available.

The resources that I found fall into two categories. The first category is the

aforementioned research where researchers in EFL and ESL conduct studies in

classrooms to see how data driven learning approaches work compared to

traditional learning methodologies when teaching a certain topic. The second

category, a much smaller category as I mentioned, is that of learning resources for

language teachers and examples of data driven learning activities.

Passapong Sripicharn's website is one resource that has materials that are

"designed to draw the learners' attention to certain vocabulary and linguistics

features by providing the students with concordance data and guiding them to

make a generalisation." (2005) He does this by first introducing students to

concordancing and the keyword-in-context (KWIC) format in lessons one and two.

The lessons that follow provide the students with what seem to be handpicked

sentences and KWIC examples that are used to illustrate some feature of the

language. Some examples of the types of exercises used are: deducing the

meaning of a word, putting sentences in order so that they make sense, fill-in-the-

blanks, and determining relationships between different grammatical structures.

Another example of resources is David Lee’s list of resources and miscellaneous

links for tools and examples of teaching using corpora. (2007)

The examples that Passapong Sripicharn uses do present some issues

because they require interpretation and the online format does not allow for

detailed analysis and synthesis of information by the student. It provides absolute

interpretations which may or may not be true. For example, in Unit 25 [1] there is a

concordance using KWIC with "hectic" being the keyword. In this example there are

five sentences using "hectic" and the students are asked to pick the word which has

the closest meaning to "hectic". If you click on what the author intended you to click

you get a pop-up window that says correct and if you guess wrong you get a pop-up

window that says try again. The available synonym options are "boring", "busy", "bad",

and "lively". The answer marked as correct is "busy"; however, I could also see "lively"

and "bad" as being viable options. From personal use I know that I've used "hectic" to

mean "bad" quite a few times. If things are busy, I just say that it was a busy week.

There are also good examples of using collocation to determine the meaning

of a word or an expression. In Unit 30 [2], for instance, Passapong Sripicharn provides

us with KWIC text for the expressions "on the road to" and "on the brink of". Based on

the collocating words that the students observe in these examples, they are then

asked to pick the right expression to fill in the blank in an exercise. The relationship

that we see is that "on the road to" is followed by something positive and "on the brink

of" is followed by something negative. Of course this has limitations, which a teacher

must account for, such as someone using "on the road to hell" or "on the road to

disaster", which obviously are not positive.
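
A teacher who wants to check such a generalisation before presenting it could count the words that follow each expression. The sketch below is one possible way to do this in Python. The function name right_collocates, the span parameter, and the example sentences are my own assumptions; the sentences are invented for illustration and do not come from Passapong Sripicharn's units.

```python
import re
from collections import Counter


def right_collocates(sentences, expression, span=1):
    """Count the words found within `span` words to the right of `expression`."""
    target = expression.lower().split()
    n = len(target)
    counts = Counter()
    for sentence in sentences:
        # Lowercase the sentence and split it into simple word tokens.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                counts.update(tokens[i + n:i + n + span])
    return counts


# Invented sentences standing in for concordance lines (assumption for illustration).
sentences = [
    "The patient is on the road to recovery.",
    "The team is on the road to success.",
    "The company was on the brink of collapse.",
    "The two countries were on the brink of war.",
    "Critics say the plan is on the road to disaster.",
]

for expression in ("on the road to", "on the brink of"):
    print(expression, dict(right_collocates(sentences, expression)))
```

The last invented sentence is included on purpose: a count like this surfaces exceptions such as "disaster", which the teacher can then discuss with the class instead of presenting the pattern as an absolute rule.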

Usage in my Classroom
The major question about data driven learning is whether or not I would be

able to implement it in a language classroom in which I taught, either an ESL/EFL

classroom or a classroom where I would be teaching Greek. Based on the

information I found in my searches for both materials and ideas to use in the
[1] http://www.geocities.com/tonypgnews/ddl_25.htm
[2] http://www.geocities.com/tonypgnews/ddl_30.htm

classroom and research articles, I would say that I would use data driven learning,

but I would not replace a whole curriculum with data driven learning. Instead I

would use data driven learning techniques to either teach specific topics, or I would

use it as a type of exercise for the students to use in the process of learning the

target language.

Since the "DDL approach suggests that grammar learning should consist

largely of consciousness-raising activities rather than the teaching of rules"

(Mansour & Ali Akbar, 2006), I would have to see where I could best fit such an

approach. In addition, the other reason why I wouldn't go all in with data driven

learning is that the literature seems to indicate that there is no clear-cut

proof that data driven learning is superior to other methods of teaching

(Boulton, 2009; Braun, 2007; Mansour & Ali Akbar, 2006).

One of the factors that I would need to consider before I implemented data

driven learning would be class size and required resources for data driven learning

activities. If the class size is too large, it may not be possible to conduct data driven

learning exercises in class, and it might work better as a homework

exercise. As homework, other factors need to be taken into account, such as whether the

students have computers at home, whether they have access to the corpora that you

want to use, and whether they have access to the tools to do a KWIC analysis.

The second factor that I would have to consider is the expectations of the

students. The predominant belief seems to be that grammar learning is associated

with learning a set of rules, and if such an element is lacking in the classroom, the

students might feel like they have not learned something. In Hadley (2002),

for instance, we see that "Kerr (1993) found in his survey of 100 teacher trainees

that attitudes toward grammar ranged from viewing it as an abstract set of rules, to

expressing feelings of terror. Similar sentiments are found in Chalker (1994), who

notes that many classroom teachers equate grammar with the acquisition of some

set of rules -- rules that are at times contradictory and at other times confusing."

In Braun (2007) we see that several students felt that they hadn't

learned any grammar because they did not write down any grammar rules. Braun

notes that "such statements reflect prevailing perceptions about learning: it is still

seen as something that happens only if, or as soon as, something is being written

down." (Braun, 2007) I think that in order for data driven learning to gain

acceptance from the students the teacher needs to do two things. The first thing is

for the teacher to explain to the students that experiential learning will not only

help them deduce the rules but will also help them learn how to analyze text

when they are in situations where they don’t know something in a text and they

don’t have the benefit of having someone with them who can explain it. The second

thing that teachers should do is to provide a summary of what the students have

learned at the end of each exercise and tie that in with other established rules. This

way the teacher helps arrange the rules that the students have synthesized through

their analysis of corpora through data driven learning, and the students who feel

that they haven’t learned anything because they didn’t write down any rules can

rest easy because they can now write down a few rules.

One final factor to decide is whether or not to use a full corpus, such as the

Corpus of Contemporary American English [3], and go full speed ahead and let the

students do their own concordances, or whether to filter the material and provide

the students with printed out concordances such as the ones provided on

[3] http://www.americancorpus.org/

Passapong Sripicharn’s (2005) website. I think this would depend on the level

of the students in the classroom, the technology limitations, the classroom makeup,

and whether or not I would want my students to focus on a specific corpus. For

instance, I may want to focus on blog language for a few lessons to illustrate a few

points. I could develop a corpus on my own for that set of lessons and hand it out to

students to use with their concordancers. Some corpora, on the other hand, may

require subscriptions, so it may be better to provide printouts of specific

concordances.

I think that in the end it comes down to knowing your students. As Hadley

(2001) writes about problems with data driven learning, students can become

demotivated if they get too much on their plate in terms of data, and they might not

have sufficient material to analyze if they don't get enough data. The

concordancing materials might be at a level beyond what the student is

comfortable with, so students aren't scaffolded, but rather are being asked to leap

and hope they can grab on to the level that the materials are on.

At this point Hadley points out that the teacher is stuck between a rock and a

hard place. The teacher can "simplify the concordance material and lessen its

authenticity," (Hadley, 2001) as Passapong Sripicharn (2005) did for his

website, "or maintain the authenticity and risk demotivating some students because

of the difficulty of the material." (Hadley, 2001)

Personally I think that if curricula incorporate data driven learning from early

ages in language learning, it's perfectly OK to choose simpler and less authentic

material. As the students mature, you can scaffold them onto more challenging and

more authentic corpora in data driven learning exercises. In an environment, such

as an ESL classroom, where you might find mixed levels of background knowledge

and analytical approaches, data driven learning can still be used. In this instance

the teacher needs to assess each student's skills and prepare material

as needed for each student. This way each student will be performing at the level

that they feel comfortable with, while at the same time scaffolding to a more

advanced level.

Bibliography
Boulton, A. (2009). Testing the limits of data-driven learning: language proficiency
and training. (F. Blin, & J. Thompson, Eds.) ReCALL: the journal of EUROCALL,
21 (1), 37-54.

Braun, S. (2007, September 6). Beyond Data-Driven Learning: Learning activities for
a spoken multimedia corpus. Retrieved April 30, 2009, from European Youth
Language:
http://www.um.es/sacodeyl/data/conferences/eurocall2007/Beyond%20Data-Driven%20Learning_eurocall2007_sb.ppt

Braun, S. (2007). Integrating Corpus Work into Secondary Education: From
Data-Driven Learning to Needs-Driven Corpora. (F. Blin, & J. Thompson, Eds.)
ReCALL: the journal of EUROCALL, 19 (3), 307-328.

Byrd, P. (1997, December). Grammar FROM Context: Re-thinking The Teaching Of
Grammar At Various Proficiency Levels. The Language Teacher Online, 21
(12).

Cobb, T., Greaves, C., & Horst, M. (2001). Can the rate of lexical acquisition from
reading be increased? An experiment in reading French with a suite of on-line
resources. In P. Raymond, & C. Cornaire, Regards sur la didactique des
langues secondes. (pp. 133-153). Montreal, QC, Canada: Éditions logique.

Hadley, G. (2001). Concordancing in Japanese TEFL: Unlocking the power of
data-driven learning. In K. Gray, M. Ansell, S. Cardew, & M. Leedham, The
Japanese Learner: Context, Culture and Classroom Practice (pp. 138-144).
Oxford, UK: Oxford University Press.

Hadley, G. (2002). Sensing the Winds of Change: An Introduction to Data-Driven
Learning. RELC Journal, 22 (2), 99-124.

Infante, P. (2009, April). Explicit Grammar Instruction: Theory & Research. Retrieved
April 30, 2009, from Applied Linguistics Student Association:
http://alsaclub.ning.com/forum/attachment/download?id=2643024%3AUploadedFi38%3A1481

Johns, T. (2000, August 1). Retrieved April 30, 2009, from Tim Johns' Data-Driven
Learning Page:
http://www.ecml.at/projects/voll/our_resources/graz_2002/ddrivenlrning/whatisddl/resources/tim_ddl_learning_page.htm

Lamy, M.-N., Klarskov Mortensen, H. J., & Davies, G. (2009). ICT4LT Module 2.4:
Using concordance programs in the Modern Foreign Languages classroom.
Retrieved April 30, 2009, from Information and Communication Technology
for Language Teachers: http://www.ict4lt.org/en/en_mod2-4.htm

Lee, D. (2007). Teaching & Misc. Links. Retrieved April 30, 2009, from David Lee's
Bookmarks for corpus-based linguistics: http://devoted.to/corpora

Mansour, K., & Ali Akbar, J. (2006). Data-driven Learning and Teaching collocation of
prepositions: The Case of Iranian EFL Adult Learners. (J. Jung, & P. Robertson,
Eds.) Asian EFL Journal, 8 (4), 192-209.

Mukherjee, J. (2005). Data Driven Learning. Retrieved April 30, 2009, from Anglistik
Language Centre Giessen:
http://www.uni-giessen.de/anglistik/ling/ALC/ddlintro.html

Passapong, S. (2005). My DDL Materials. Retrieved April 30, 2009, from Tony's DDL:
http://geocities.com/tonypgnews/units_index_pilot.htm

Payne, J. S. (2008, June 8). Data-Driven South Asian Language Learning. Retrieved
April 30, 2009, from The University of Chicago South Asian Language
Resource Center:
http://salrc.uchicago.edu/workshops/sponsored/061005/DDL.ppt

Rüschoff, B. (n.d.). Data-Driven Learning (DDL): the idea. Retrieved April 30, 2009,
from
http://www.ecml.at/projects/voll/rationale_and_help/booklets/resources/menu_booklet_ddl.htm

Tian, S. (2005). Data-Driven Learning: Do Learning Task and Proficiency Make a
Difference? Proceedings of the 9th Conference of the Pan-Pacific Association of
Applied Linguistics (pp. 360-372). Tokyo, Japan: Waseda University Media Mix
Corp.

Truscott, J. (1998). Noticing in second language acquisition: A critical review.
Second Language Research, 14 (2), 103-135.

Tuttle, H. (2009, April 25). Empowering Teachers to be Data Driven Decision
Makers. Retrieved April 30, 2009, from
archive.techlearning.com/techlearning/pdf/events/techforum/chi05/Tuttle_Empowering_D3M_TFCH05.pdf
