Penn Parsed Corpora of Historical English

The Penn-Helsinki Parsed Corpus of Middle English, second edition (PPCME2), the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME), and the Penn Parsed Corpus of Modern British English (PPCMBE) are syntactically annotated corpora of prose text samples of English from the indicated time periods. Their syntactic annotation (parsing) permits searching not only for words and word sequences but also for syntactic structure. The corpora are designed for the use of students and scholars of the history of English, especially the historical syntax of the language, and they are publicly available under the following conditions:
 
The PPCME2 and the PPCEME are currently available on CD-ROM at a charge of $300 for a five-user license covering both corpora. The PPCMBE is currently under construction and will be distributed for a modest additional charge when completed. Contact Anthony Kroch if site licenses for more than five users are desired. Funds paid for the corpora go toward improving and enlarging them. Updates are available at nominal cost to corpus license holders. Instructions for ordering the corpus CD-ROM are posted here.

The Penn Corpora are distributed with a search program, CorpusSearch2, written by Beth Randall and released as open-source software. CorpusSearch2 is also available at its SourceForge project web site.

  • The PPCME2 was created with the support of the National Science Foundation (Grants BNS89-19701 and SBR95-11368), with supplementary support from the University of Pennsylvania Research Foundation.
  • The PPCEME was created with the support of the National Endowment for the Humanities (Grant PA23382-99) and the National Science Foundation (Grant BCS99-05488).
  • The PPCMBE is being created with the support of the National Science Foundation (Grant BCS04-18061).



Byland Abbey, Yorkshire. It was at abbeys like Byland, throughout Britain, that the manuscripts on which our knowledge of Middle English is based were largely written, copied and preserved. The monastic orders that built and inhabited these monasteries were dissolved by Henry the Eighth, whereupon the buildings were dismantled for building materials by the landlords who succeeded to the monastic estates. Most of the abbeys' manuscripts were lost, but some came into private hands and so survived. Photo © A. Kroch 1998.