<I>

  <&>Wellington Corpus of Spoken New Zealand English Version One</&>
  <&>Copyright 1998 School of Linguistics & Applied Language Studies</&>
  <&>Victoria University of Wellington</&>

  <&>side one</&>
  <&>0:15</&>
  

  <WSC#MUL029:0005:QR>
      <O>clears throat</O> <.>o</.> <.>o</.> okay well <.>we</.> we're
      going to look at um <,,> word families which i call word
      families because it's er within the first <,> one thousand words
      of english and so rather than call it anything else so <.>that</.>
      that's the main thing

  <WSC#MUL029:0010:QR>
      now <.>the</.> want to start off by looking at why <,> why study
      word families <,> and so that gives the motivation for the study
      why <.>j</.> jerry and i have got together i've got jerry
      together with me and then to try and see what's the reasons for
      looking at that and to show you that there's <.>really</.> the
      really the decision about what a word is in terms of <,> either
      just <&>1:00</&> a form or its members really has quite far
      reaching effects when we look at <.>v</.> <.>v</.> vocab size
      and we look at language teaching and language learning <,>

  <WSC#MUL029:0015:QR>
      now the first thing is word family decisions affect measures of
      vocabulary size so if <.>we</.> if we want to ask the question
      how large are people's vocabularies and <,> <O>inhales</O> and i
      like to ask that question <&>1:24</&> <&>five seconds of
      laughter</&> <&>1:29</&>

  <WSC#MUL029:0020:QR>
      then it's <.>really</.> the ANSWER really depends on what do you
      mean by a word? because if you say um you know a word includes
      um only <.>th</.> only the inflectional forms of <.>that</.> of
      the stem plus the inflectional forms then you get quite a
      different answer to that question than if you include um <O>voc</O>
      derivational forms as well

  <WSC#MUL029:0025:QR>
      just take for example the first one thousand words of english

  <WSC#MUL029:0030:QR>
      if you take the <.>first</.> the <&>2:00</&> most frequent one
      thousand word STEMS of english and then you add the inflectional
      forms to this and then you add a very limited range of
      derivational forms then you end up with over four thousand forms
      for the first one thousand words of english and so in terms of
      vocab size you can see just for that what the effect of it could
      be
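  <&>editorial note: the effect described above can be sketched in code. This is a minimal illustration, not the speakers' method. The toy lexicon, the relation labels and the function name vocabulary_size are all hypothetical.</&>

```python
# Hypothetical toy lexicon: each form maps to (stem, relation to the stem).
# The entries are illustrative and are not taken from any real word list.
LEXICON = {
    "teach": ("teach", "stem"),
    "teaches": ("teach", "inflection"),
    "taught": ("teach", "inflection"),
    "teacher": ("teach", "derivation"),
    "happy": ("happy", "stem"),
    "happier": ("happy", "inflection"),
    "happiness": ("happy", "derivation"),
    "unhappy": ("happy", "derivation"),
}


def vocabulary_size(forms, include_derivations):
    """Count words when a 'word' is a stem plus its inflections,
    optionally folding derived forms into the same family."""
    units = set()
    for form in forms:
        stem, relation = LEXICON[form]
        if relation == "derivation" and not include_derivations:
            units.add(form)  # a derived form counts as a separate word
        else:
            units.add(stem)  # the form is folded into its stem's family
    return len(units)


forms = list(LEXICON)
inflections_only = vocabulary_size(forms, include_derivations=False)
with_derivations = vocabulary_size(forms, include_derivations=True)
print(len(forms), inflections_only, with_derivations)
```

  <&>editorial note: the same eight forms give three different vocabulary sizes depending on which definition of a word is applied.</&>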

  <WSC#MUL029:0035:QR>
      then it's looking <,> not only at people's vocabularies but
      looking at the size of a text

  <WSC#MUL029:0040:QR>
      if you say well i want to read this text i'm not a native
      speaker of english how many words do i need to read this text so
      how many words are there in this text

  <WSC#MUL029:0045:QR>
      well then you have that same <,> issue being important again

  <WSC#MUL029:0050:QR>
      how many words does a dictionary contain <O>laughs</O>

  <WSC#MUL029:0055:QR>
      that's that's such an interesting question that the er i've been
      told that the oxford dictionary people leave it up to their
      publicity people to <others laugh>determine <unclear>word</unclear></others laugh>
      NO <&>3:00</&>

  <WSC#MUL029:0060:QR>
      THAT'S TRUE <,>

  <WSC#MUL029:0065:QR>
      <.>r</.> robert burchfield told me that and so the number of
      words in the shorter oxford dictionary er <.>the</.> what's on
      the blurb disagrees with what's in the introduction by about a
      hundred per cent so but <.>tha</.> but <.>still</.> so that's a
      part of it as well and then the next thing which relates vocab
      size to learning and teaching is what are feasible goals for a
      second language learning vocabulary development programme so if
      you're saying well we're going to have these people for x amount
      of time and we want to do some teaching

  <WSC#MUL029:0070:QR>
      now in terms of vocab <O>swallows</O> what are the goals of the
      programme

  <WSC#MUL029:0075:QR>
      well it's gonna make a big difference whether you decide you're
      teaching them one thousand words or four thousand words if it's
      that sort of thing

  <WSC#MUL029:0080:QR>
      so <.>c</.> continuing with the teaching and learning one then
      what's the unit <,> of teaching of vocab teaching and vocab
      learning

  <WSC#MUL029:0085:QR>
      so if we say we're gonna learn a word or we're gonna TEACH a
      word or start teaching a word then what's the unit that we're
      dealing <&>4:00</&> with <,,>

  <WSC#MUL029:0090:QR>
      what needs to be TAUGHT

  <WSC#MUL029:0095:QR>
      do we teach words as items or <.>are</.> are we <.>do</.> should
      we really be giving a lot of attention to the system which lies
      behind word building <O>inhales</O> <O>clears throat</O>

  <WSC#MUL029:0100:QR>
      so if we're teaching items then we have to teach all the
      derivatives and everything as separate items or do we say well
      if we teach the stem then their knowledge of the system will
      take care of all of these other ones therefore we should make
      sure that they have a good knowledge of the system <,> <O>tut</O>

  <WSC#MUL029:0105:QR>
      how do people's vocabularies grow <quietly>okay</quietly>

  <WSC#MUL029:0110:QR>
      so what's the role of word building in the growth of an
      individual's vocabulary

  <WSC#MUL029:0115:QR>
      so if <,> if er a young child's vocabulary is expanding and
      expanding how is it expanding

  <WSC#MUL029:0120:QR>
      is it expanding substantially through the addition of new <O>swallows</O>
      stems <.>is</.> or is it expanding substantially through the
      addition of new derivatives of known stems and so on <.>so</.>
      <&>5:00</&> so the question of what's a word relates to that and
      then you get on to even more abstruse issues of how's vocabulary
      stored and retrieved in the brain so <O>inhales</O> <exhales>er</exhales>
      there's an interesting piece of research which looks at what's
      the best indicator of speed of retrieval of an item

  <WSC#MUL029:0125:QR>
      is it the frequency of a form or is it the frequency of the
      combined members of the word family and the research indicates
      that actually if you add the frequency of the members of the
      word family together that gives you a better predictor of the
      speed at which someone can retrieve <.>an</.> <O>voc</O> any
      member of that family from their brain than the actual <.>spee</.>
      the frequency of the form itself so there is <.>some</.> some
      evidence but we'll come on to that a little bit later
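  <&>editorial note: a minimal sketch of the predictor described above, in which the frequencies of all members of a word family are added together. The counts, the family mapping and the function name family_frequencies are hypothetical and do not come from the research mentioned.</&>

```python
from collections import defaultdict

# Hypothetical frequency counts for individual forms (illustrative only).
form_freq = {
    "teach": 120,
    "teaches": 30,
    "teaching": 90,
    "teacher": 150,
    "taught": 60,
}

# Hypothetical mapping from each form to the headword of its word family.
family_of = {
    "teach": "teach",
    "teaches": "teach",
    "teaching": "teach",
    "teacher": "teach",
    "taught": "teach",
}


def family_frequencies(form_freq, family_of):
    """Sum the frequency of every form into its word family's total."""
    totals = defaultdict(int)
    for form, freq in form_freq.items():
        # A form with no listed family is treated as its own family.
        totals[family_of.get(form, form)] += freq
    return dict(totals)


family_freq = family_frequencies(form_freq, family_of)
print(form_freq["taught"], family_freq["teach"])
```

  <&>editorial note: on the account given in the talk, the combined family total would be the better predictor of retrieval speed for any member, compared with that member's own form frequency.</&>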

  <WSC#MUL029:0130:QR>
      so that's sort of the motivation which lies behind it in a
      general way and in the sort of <.>metholo</.> methodological way
      there's people doing research on vocab size and on teaching and
      learning vocab all through the world but they don't agree on
      what THEY call a word and what <&>6:00</&> what's a word family
      and so on and so the idea is that if we can start to define this
      then the <.>an</.> there'll be no one answer so <.>we're</.>
      looking at a system of levels and you can say well i'm working
      with the definition of a word at level six <quietly>and so that
      includes all of these things</quietly> and then we might be able
      to get some agreement between the different people working on
      this and we can start to add some of the things together <O>tut</O>

  <WSC#MUL029:0135:QR>
      another example of the effect of this is some of you might know
      the <?>thorndike and lorge</?> teacher's wordbook of thirty
      thousand words <,>

  <WSC#MUL029:0140:QR>
      okay

  <WSC#MUL029:0145:QR>
      if you take another definition of what's a word family and you
      include a few derivative forms in that then the <.>t</.>
      teacher's wordbook of thirty thousand words becomes a teacher's
      wordbook of THIRTEEN thousand six hundred words because simply
      by adding a few simple word building suffixes and prefixes you
      reduce the size of that list by half <quietly>more than half</quietly>
      <,> <O>tut</O>
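  <&>editorial note: a quick arithmetic check of the figures quoted above, using only the two numbers given in the talk. The variable names are illustrative.</&>

```python
# Figures as quoted in the talk.
original_size = 30_000   # entries in the wordbook as published
regrouped_size = 13_600  # entries after grouping a few derived forms

# Proportion by which the list shrinks under the wider definition.
reduction = 1 - regrouped_size / original_size
print(f"{reduction:.1%}")
```

  <&>editorial note: the reduction is a little over half, which agrees with the speaker's correction to more than half.</&>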

  <WSC#MUL029:0150:QR>
      okay so that's that

  <WSC#MUL029:0155:QR>
      any comments on that before we move on to the next one what
      knowledge do we need to know <quietly>to <,> make use of word
      families</quietly> <&>7:00</&> <&>one minute and sixteen seconds
      of questions from audience and replies by QR and others not
      transcribed</&> <&>8:16</&>

  <WSC#MUL029:0160:QR>
      <O>clears throat</O> now <,> in order to make use of this idea
      of a word family then <.>th</.> there <.>ar</.> there are things
      <.>that</.> that <.>are</.> a learner needs to know

  <WSC#MUL029:0165:QR>
      first thing is <.>nee</.> you need to know the stems

  <WSC#MUL029:0170:QR>
      okay that's pretty obvious

  <WSC#MUL029:0175:QR>
      er the research in this area has been looking at <.>b</.> using
      learners age or grade level in school in the american studies
      <.>to</.> as a sort of <O>inhales</O> an index for this but
      clearly with second language learners you'd need to have some
      other way of looking at it and one way which someone might look
      at is by considering what's their vocabulary size as measured on
      a vocab size test and then try and relate this to to er
      knowledge of stems and and then relate <&>9:00</&> that to
      knowledge of er the derivatives and so on of the stems

  <WSC#MUL029:0180:QR>
      so that's the first thing

  <WSC#MUL029:0185:QR>
      the second thing is in order to make use of this knowledge you
      have <.>t</.> the learner would have to be able to recognise
      known stems in words okay and then also to some degree not
      recognise stems in words where they WEREN'T occurring and um
      there's <O>voc</O> also <.>research</.> research on that and er
      i think in the research they call this relational knowledge and
      there's a indication that learners <O>voc</O> native speakers
      get quite adept at being able to do this by about the fourth
      grade which <.>is</.> what's that that's about nine years nine
      ten years old or something like that <,>

  <WSC#MUL029:0190:QR>
      the third type of knowledge is to be able to recognise
      contribution of affixes <.>to</.> that they make when they're
      added to a stem <,> and <,> there <.>are</.> there's two aspects
      of this

  <WSC#MUL029:0195:QR>
      there's a sort of <&>10:00</&> <.>s</.> <.>some</.> a some
      affixes add a lexical element so i think something like less and
      ful add something of a lexical element to it <.>but</.> with the
      suffixes and then you have a syntactic <,> change often
      occurring with the addition of a um suffix and then the fourth
      thing which is something that in <&>pronounced inz</&> the study
      we're engaged in we're not terribly interested in cos we're
      really looking at it <.>f</.> from the point of view of
      receptive knowledge and that's to be able to produce allowable
      stem affix combinations and so that's really another way of
      saying it's what they call distributional knowledge in the
      research and that's sort of sort of say well if you have ness
      what sort of <.>word</.> what sort of stem can you add ness to
      and that's another kind of knowledge <,> <O>inhales</O> <O>tut</O>

  <WSC#MUL029:0200:QR>
      okay so <.>that's</.> so <,> that's er the sorts of knowledge
      needed

  <WSC#MUL029:0205:QR>
      now the next um thing is then what we wanted to do was to see
      can we take the <,> prefixes and <&>11:00</&> suffixes of
      english and then by applying <,> criteria divide them into
      stages or levels starting from <,> the most conservative and
      then moving up to things which are very ambitious <O>inhales</O>
      and jerry has done <,> all the hard work on this

  <WSC#MUL029:0210:QR>
      all i've done is <.>be</.> <,,> er to check it against some
      other work done by thorndike in nineteen forty one and to er <O>laughs</O>
      try and destroy what he's done by cutting out a couple of the
      levels cos he had ten <O>laughs</O>

  <WSC#MUL029:0215:QR>
      i thought ten seemed too good to be true so we reduced it to
      eight <&>11:35</&> <&>three seconds of laughter</&> <&>11:38</&>

  <WSC#MUL029:0220:QR>
      now the criteria then we <.>qui</.> quickly flash through those
      are looking at frequency so that if we want a <,> we want to
      look at er <,,> we want to make sure that that if we want to
      include something in a in a <.>very</.> <quickly>what do we call
      it</quickly> shall we call one a high or two a high level yep if
      we want to put something in a high level then it's <{1><[1>something
      which occurs <&>12:00</&> <.>ver</.></[1>

  <WSC#MUL029:0225:QR>
      is it

  <WSC#MUL029:0230:QR>
      okay <{2><[2>then we'll call it a low level then</[2>

  <WSC#MUL029:0235:QR>
      <others laugh>we <.>want</.> we we <.>want</.> we want to
      include</others laugh> it in er <,,> if we want to make sure it
      occurs in in lots of <.>it</.> in lots of words and so <.>it</.>
      it's something which is is common so that's a sort of a <.>fr</.>
      a FREquency criteria and then you have er sort of criteria which
      are related to regularity and <.>s</.> and systematicity i guess
      and so highly productive is a sort <.>of</.> the overlap between
      frequency and regularity i guess <,>

  <WSC#MUL029:0240:QR>
      meaning is predictable once the category of the base is known
      okay <.>that's</.> that's regularity of the system again <,>

  <WSC#MUL029:0245:QR>
      move the affix <&>pronounced affesk</&> leave the base <&>pronounced
      beith</&> orthographically intact <O>inhales</O> okay <,> then
      phonologically intact and so on and i think the criteria are
      roughly in order of importance too

  <WSC#MUL029:0250:QR>
      is that right? <X>

  <WSC#MUL029:0255:XX>
      <[1><unclear>word</unclear> low level elsewhere</[1></{1>

  <WSC#MUL029:0260:XX>
      <[2><O>laughs</O></[2></{2></X> <&>13:00</&> <X>

  <WSC#MUL029:0270:XX>
      i think that's right</X>

  <WSC#MUL029:0275:QR>
      yeah roughly in order of importance and so by applying these
      criteria then we try <.>and</.> we we think we've come to these
      levels which are over here

  <WSC#MUL029:0280:QR>
      you've got levels one to eight <,> okay <&>13:13</&> <&>two
      minutes and fourteen seconds of questions and discussion by
      members of audience not transcribed</&> <&>15:27</&>

  <WSC#MUL029:0285:QR>
      <quietly>okay <,,><&>3</&> good</quietly> so we look at the
      levels then

  <WSC#MUL029:0290:QR>
      so level one each form is a different word <,> okay

  <WSC#MUL029:0295:QR>
      i put capitalisation is ignored because in one of the word
      frequency studies the carroll davies richman study actually
      capitalisation even made a different word in their definition of
      what was a word so a word completely written in capitals was
      different from the word which had the first letter as a capital
      which was different from the word written in lower case and so
      on <&>16:00</&>
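  <&>editorial note: a minimal sketch of the level one definition, where each form is a different word and capitalisation is ignored, set against a case-sensitive count of the kind attributed to the frequency study. The function name count_word_types and the sample tokens are hypothetical.</&>

```python
def count_word_types(tokens, ignore_case=True):
    """Count distinct forms; each distinct form is a different word."""
    if ignore_case:
        # Level one as described: capitalisation does not create a new word.
        tokens = [token.lower() for token in tokens]
    return len(set(tokens))


tokens = ["WORD", "Word", "word", "family"]
case_sensitive = count_word_types(tokens, ignore_case=False)
case_insensitive = count_word_types(tokens)
print(case_sensitive, case_insensitive)
```

  <&>editorial note: under the case-sensitive definition the three spellings of word count as three words, and under level one they count as one.</&>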
</I>
