$ | possessive marker |
EX | existential THERE |
MAN | indefinite pronoun MAN (Middle English) |
N | common noun, singular |
N$ | common noun, singular, possessive |
NPR | proper noun, singular |
NPR$ | proper noun, singular, possessive |
NPRS | proper noun, plural |
NPRS$ | proper noun, plural, possessive |
NS | common noun, plural |
NS$ | common noun, plural, possessive |
OTHER | OTHER, nominal use, singular |
OTHER$ | OTHER, nominal use, singular, possessive |
OTHERS | OTHER, nominal use, plural |
OTHERS$ | OTHER, nominal use, plural, possessive |
whe+ter_WQ +tere_EX were_BED mo_QR of_P his_PRO$ predecessours_NS in_P paradys_N o+ter_CONJ in_P helle_N
and_CONJ +tere_EX were_BED i-seie_VAN wonder_ADV false_ADJ si+gtes_NS and_CONJ fals_ADJ tokenes_NS
for_P +dan_D +de_C me_MAN nett_VBD hem_PRO to_P +dan_D a+de_N ._.
+Teih_P me_MAN niede_VBP me_PRO to_P +dan_D a+de_N ,_, me_MAN ne_NEG net_VBP me_PRO noht_NEG te_TO forsweri+gen_VB
For_CONJ +tar_ADV man_MAN ne_NEG can_VBP his_PRO$ mu+des_N$ me+de_N
+Dis_D word_N credo_FW man_MAN mai_MD understonden_VB on_P +tre_NUM wise_NS ._.
all_Q +te_D host_N +tat_C cam_VBD with_P +te_D king_N were_BED robbid_VAN be_P northen_ADJ men_NS
In connection with proper nouns (SOUTHERN CROSS), the same principles apply to these adjectives as to ordinary adjectives.
Otherwise, compass points are tagged N, whether used alone or with another noun.
Fro_P <font>_CODE Cathay_NPR </font>_CODE go_VBP men_NS toward_P the_D est_N be_P many_Q iorneyes_NS
if_P we_PRO gone_VBP toward_P +te_D north_N
Thomas_NPR Grey_NPR ,_, a_D knyte_N of_P +te_D north_N
and_CONJ ano+tere_D+OTHER fram_P +te_D North_N into_P +te_D South_N ,_, +tat_C was_BED callede_VAN Ikenyle_NPR strete_NPR
+de_D nor+d_N half_N
+te_D nor+t_N hille_N
+te_D nor+tside_N+N
For the treatment of NORTH, SOUTH, EAST, WEST in connection with
proper nouns, see Proper nouns, especially
the exceptions to the principle of avoiding redundant use of NPR and the
discussion of unique entities.
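The flat examples in this section are written as word_TAG tokens separated by spaces. As a minimal sketch (not part of the corpus tools), such a line can be split into word/tag pairs as follows. The function name `split_tagged` is hypothetical, and the code assumes that a tag never contains an underscore, so the split is made at the last underscore of each token.

```python
def split_tagged(line):
    """Split a line of word_TAG tokens into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # Assumption: the tag contains no underscore, so split at the LAST one.
        # This keeps punctuation tokens such as "._." and markup tokens intact.
        word, tag = token.rsplit("_", 1)
        pairs.append((word, tag))
    return pairs


example = ("all_Q +te_D host_N +tat_C cam_VBD with_P +te_D king_N "
           "were_BED robbid_VAN be_P northen_ADJ men_NS")
pairs = split_tagged(example)

# Collect the plural common nouns in the example.
plural_nouns = [word for word, tag in pairs if tag == "NS"]
print(plural_nouns)
```

A token without any underscore would raise a `ValueError` here; a real reader for the corpus files would need to decide how to handle such input.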
Units of measure (DAY, POUND, YEAR, etc.)
Units of measure after numbers (TEN YEAR, etc.) are tagged as singular
or plural depending on overt number marking. Forms in -s, -a, or
-en are marked as plural, and all others as singular.
xl_NUM daies_NS
sex_NUM monthis_NS
vii_NUM +gere_N
.xxx._NUM +gera_NS
.xx._NUM yeres_NS
three_NUM hondred_NUM wynter_N
ueale_Q hund_NUM wintra_NS
ix_NUM c_NUM pound_N
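The rule for units of measure can be sketched as a purely orthographic check. This is only an illustration under a strong assumption, namely that any form ending in -s, -a, or -en carries overt plural marking; the function name `measure_tag` is hypothetical, and an annotator would still need to judge whether a given ending really is a plural ending.

```python
# Endings that the guidelines treat as overt plural marking on units of measure.
PLURAL_ENDINGS = ("s", "a", "en")


def measure_tag(form):
    """Return NS for a form with an overt plural ending, otherwise N."""
    return "NS" if form.lower().endswith(PLURAL_ENDINGS) else "N"


for form in ["daies", "monthis", "+gere", "+gera", "yeres", "wynter", "wintra", "pound"]:
    print(f"{form}_{measure_tag(form)}")
```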
Possessive and genitive nouns (N$, NS$, $)
Common nouns standing in a possessive/genitive relationship with other
nouns are tagged N$, NS$. As with the plural, genitive marking
in early texts predates universal -S. In these cases, N$, NS$
indicates genitive/possessive function rather than any particular
morphological form. Conversely, morphologically genitive nouns that do
not stand in a relationship with some other noun are not tagged
N$, NS$; see NP adverbs
for examples.
+te_D mannes_N$ shrifte_N         the man's shrift
+te_D sowle_N$ fode_N             the soul's food
his_PRO$ sinne_N$ sore_N          sorrow of his sin
+te_D apostles_NS$ mu+des_NS      the apostles' mouths
+ter_PRO$ apostlene_NS$ lore_N    the apostles' teaching
kinges_NS$ sunes_NS               kings' sons
alre_Q kinge_NS$ king_N           king of all kings
The $ tag generally appears directly only on nouns and
pronouns (N, NS, NPR, NPRS, PRO, WPRO), indicating their
relationship with other nouns. However, in the absence of an overt noun
in a possessive NP to carry the possessive marker, the $ tag
can appear directly on other nominal categories, yielding NUM$,
OTHER$, OTHERS$, Q$. A common case in Middle English is the ALRE plus superlative
The dollar tag ($)
In addition to appearing directly on other tags (see last paragraph),
the $ tag can also appear alone. It is always used alone for
HIS in the JOHN HIS BOOK construction, and it is sometimes so used for
the possessive clitic ('S, S), which postdates the texts in the PPCME2,
although it appears occasionally in the edited texts. When the
possessive clitic is spelled as a separate word, as it sometimes is, it
always receives a tag of its own. When spelled together with the
preceding word, it is treated like a genitive ending if it is attached to a
noun (N, NS, NPR, NPRS) and not preceded by an apostrophe. Otherwise (THE
BODMIN'S HAT), it is split off with an emendation. See Genitive/possessive modifiers of N
for the structures corresponding to the following examples.
the_D Lord_N of_P Bodmins_NPR$ hat_N
the_D Lord_N of_P $Bodmin_NPR $'s_$ {TEXT:Bodmin's}_CODE hat_N
the_D Lord_N of_P Bodmin_NPR s_$ hat_N
the_D Lord_N of_P Bodmin_NPR 's_$ hat_N
God_NPR $almighty_ADJ $s_$ {TEXT:almightys}_CODE mercy_N    <-- emendation because ADJ, not N
God_NPR $almighty_ADJ $'s_$ {TEXT:almighty's}_CODE mercy_N
God_NPR almighty_ADJ s_$ mercy_N
God_NPR almighty_ADJ 's_$ mercy_N
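The treatment of the possessive clitic can be summarized as a small decision procedure. The sketch below is illustrative only: the function name `tag_possessive` and its parameters are hypothetical, and the output format simply imitates the word_TAG examples above (including the `$`-prefixed emended tokens and the `{TEXT:...}_CODE` token that records the original spelling).

```python
NOUN_TAGS = {"N", "NS", "NPR", "NPRS"}


def tag_possessive(stem, stem_tag, clitic, separate):
    """Return word_TAG tokens for a stem followed by possessive s or 's.

    stem      word without the clitic, e.g. "Bodmin"
    stem_tag  ordinary tag of the stem, e.g. "NPR" or "ADJ"
    clitic    "s" or "'s" as spelled in the text
    separate  True if the clitic is spelled as a separate word
    """
    if separate:
        # Spelled as a separate word: the clitic gets a tag of its own.
        return [f"{stem}_{stem_tag}", f"{clitic}_$"]
    if stem_tag in NOUN_TAGS and not clitic.startswith("'"):
        # Attached to a noun without an apostrophe: treated as a genitive ending.
        return [f"{stem}{clitic}_{stem_tag}$"]
    # Otherwise: split off with an emendation, keeping the original spelling.
    return [f"${stem}_{stem_tag}", f"${clitic}_$", f"{{TEXT:{stem}{clitic}}}_CODE"]


print(" ".join(tag_possessive("Bodmin", "NPR", "s", separate=False)))
print(" ".join(tag_possessive("Bodmin", "NPR", "'s", separate=False)))
print(" ".join(tag_possessive("almighty", "ADJ", "s", separate=False)))
print(" ".join(tag_possessive("almighty", "ADJ", "'s", separate=True)))
```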
Many inconsistencies and outright errors likely remain with respect to the tagging of proper nouns.
Avoid redundant use of NPR. In general, when combining with words that are proper nouns on their own, words that aren't proper nouns on their own are given their ordinary tag. For instance, prepositions are not treated as part of proper nouns except when they are not spelled separately or in the case of foreign names.
(NP (NPR Gy) (PP (P of) (NP (NPR Marchia))))
(NP (NPR seint) (NPR Patrik) (PP (P of) (NP (NPR Ireland))))
(NP (NPR Berwick) (PP (P upon) (NP (NPR Tweed))))
but: (NP (NPR Stratford-upon-Avon))
Systematic exceptions to this principle occur in connection with noun-noun pairs and offices.
Maximize internal structure.
In general, our annotation maximizes the internal structure of noun
phrases that contain proper nouns. Particularly noteworthy is the case
of potential appositive structures. Following THE or possessive pronouns,
noun-noun pairs are always treated as appositive structures, although
this is almost certainly the wrong analysis in some cases.
In such cases, the "name" part is tagged NPR even if it is
not a noun.
(NP (NPR Saint) (NPR John)
    (NP-PRN (N Baptist)))
(NP (NPR David)
    (NP-PRN (D the) (N prophet)))
(NP (NPR Iohannes)
    (NP-PRN (D +de) (N godspellere)))
(NP (PRO$ my) (N cousin)
    (NP-PRN (NPR Roper)))
(NP (PRO$ my) (N lorde)
    (NP-PRN (NPR Arthure)))
(NP (D the) (N kynge)
    (NP-PRN (NPR Royns))
    (PP (P of)
        (NP (NPR Northe) (NPR Walis))))
(NP (D the) (ADJ grete) (N Lady)
    (NP-PRN (NPR Lyle))
    (PP (P of)
        (NP (NPR Avilion))))
(NP (D +te) (ADJ gentil) (N Erl)
    (NP-PRN (NPR Thomas)))
(NP (D the) (N virgin)
    (NP-PRN (NPR Mary)))
(NP (D the) (VAN blessed) (N virgin)
    (NP-PRN (NPR Mary)))
(NP (D the) (N Castell)
    (NP-PRN (NPR Aungel)))
(NP (D the) (N castell)
    (NP-PRN (NPR Nygurmous)))
(NP (D the) (N flum)
    (NP-PRN (NPR Iordan)))
(NP (D the) (N water)
    (NP-PRN (NPR Ponte)))
(NP (D the) (N Castell)
    (NP-PRN (NPR Terrable)))
(NP (D the) (N Sege)
    (NP-PRN (NPR Perelous)))
This principle has the following exception:
(NP (PRO$ our) (NPR Lord) (NPR God))
Foreign names. Foreign names are tagged as proper nouns
(NPR) rather than as foreign words (FW).
(NP (NPR Sankgreall))
(NP (NPR Nova) (NPR Scotia))
In contrast to closed-class items in English names, closed-class items
in foreign names (DE, DU, LE, LA, etc.) are always treated as part of
the name.
(NP (NPR Petir) (NPR de) (NPR Luna))
(NP (NPR Melyot) (NPR de) (NPR Logyrs))
(NP (NPR Sagramour) (NPR le) (NPR Desyrus))
Plural marking.
As with units of measure, plural tags
are used only on items with explicit plural marking.
(NP (D the) (NPR West) (NPRS Saxons))
Words that cannot bear plural marking (ENGLISH, FRENCH) are tagged
as adjectives, not as proper nouns. See Groups of people for discussion.
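The bracketed examples in this section use labeled bracketing, in which every opening parenthesis needs a matching close. As a minimal sketch (not the corpus's own tooling), one such tree can be read into nested Python lists, which also catches unbalanced brackets. The function name `parse_tree` is hypothetical, and the code assumes that words themselves never contain parentheses.

```python
def parse_tree(text):
    """Parse one bracketed tree into nested lists, e.g. ['NPR', 'Gy']."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        node = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read())      # nested constituent
            else:
                node.append(tokens[pos])  # label or word
                pos += 1
        if pos == len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume the closing parenthesis
        return node

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected text after tree")
    return tree


tree = parse_tree("(NP (NPR Gy) (PP (P of) (NP (NPR Marchia))))")
print(tree)
```

Feeding the function a tree with a missing closing parenthesis raises a `ValueError`, which makes it usable as a quick well-formedness check on hand-typed examples.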
Proper nouns by form
Bare nouns
Bare nouns denoting offices (ARCHBISHOP, EARL, JUSTICE, KING, POPE, PROTECTOR) are not proper nouns on their own.
Bare nouns that are names are proper nouns on their own. These include:
Days of the week
Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday
Holidays and special days
Christmas, Easter, Lammas, Michaelmas, Pentecost, Whitsunday, etc.
Months of the year
January, February, March, April, May, June, July, August, September, October, November, December
Persons (including pagan gods)
Elizabeth, Henry, Pericles, Tully, etc.
Athena, Artemis, Jupiter, Mars, Venus, etc.
Places
Athens, England, London, Paris, Rome, etc.
Unique entities (including GOD, DEVIL, and their epithets)
In adjective-noun pairs, if the head noun is a proper name on its own, then the adjective is tagged ADJ (in keeping with the principle of avoiding redundant use of NPR).
days
(NP (ADJ Good) (NPR Friday))
(NP (ADJ Holy) (NPR Saturday))
persons
(NP (ADJ Bloody) (NPR Mary))
places
(NP (ADJ New) (NPR Troye))
unique entities
(NP (ADJ holy) (NPR cherche))
(NP (ADJ almighty) (NPR God))
(NP (NPR god) (ADJP (ADJ almihtin)))
(NP (NPR Lord) (ADJP (ADJ Almighty)))
(NP (ADJ holy) (NPR scripture))
If the head noun is not a proper noun on its own, then the adjective is tagged NPR along with the noun.
places
(NP (D the) (NPR Holy) (NPR Lond))
(NP (D the) (NPR North) (NPR Pole))
(NP (D the) (NPR rede) (NPR see))
unique entities
(NP (D the) (NPR Southern) (NPR Cross))
(NP (D the) (NPR Holy) (NPR Ghost))
(NP (D the) (NPR Round) (NPR Table))
(NP (D the) (NPR Old) (NPR Testament))
(NP (NPR Holy) (NPR Writ))
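The two adjective-noun rules above can be sketched as a single lookup. This is an illustration under stated assumptions: the set `PROPER_ON_OWN` is a tiny hypothetical lexicon standing in for the annotator's judgment about which nouns are proper nouns on their own, the function name `tag_adj_noun` is invented, and the pair is assumed to be already known to function as a name (ordinary phrases such as GOOD MAN are outside its scope).

```python
# Hypothetical mini-lexicon of nouns that are proper nouns on their own.
PROPER_ON_OWN = {"friday", "saturday", "mary", "troye", "god", "scripture"}


def tag_adj_noun(adj, noun, proper_on_own=PROPER_ON_OWN):
    """Tag an adjective-noun pair that functions as a name."""
    if noun.lower() in proper_on_own:
        # Head is already proper: avoid redundant NPR on the adjective.
        return [(adj, "ADJ"), (noun, "NPR")]
    # Head is not proper on its own: the adjective shares the NPR tag.
    return [(adj, "NPR"), (noun, "NPR")]


print(tag_adj_noun("Good", "Friday"))
print(tag_adj_noun("Round", "Table"))
```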
Specific epithets associated with a specific person are not treated like offices. If such an epithet is used without the person's name to refer to that person, the epithet is tagged NPR.
(NP (D the) (NPR Baptist))      (referring to John)
(NP (D the) (NPR Conqueror))    (referring to William, etc.)
(NP (D the) (NPR Ironside))     (referring to Edmund)
(NP (D the) (NPR virgin))       (referring to Mary)
Epithets used with a person's name (JOHN BAPTIST, EDMUND IRONSIDE),
on the other hand, are given appositive
structures (in keeping with the principle of maximizing internal
structure).
D/PRO$ + N + NPR
Instances of the type THE EARL THOMAS, MILORD CROMWELL are always
treated as appositive structures (in keeping
with the principle of maximizing
internal structure).
Instances of the type OUR LORD GOD are exceptions to the principle of maximizing internal structure.
They are treated as flat strings.
(NP (PRO$ Oure) (NPR Lorde) (NPR Godd))
(NP (PRO$ Oure) (NPR Lorde) (NPR Jhesu) (NPR Crist))
In general, in phrases of the type THE N OF NP, the first noun is tagged N. See CITY, TOWER for two special cases.
(NP (D the) (N Abbay) (PP (P of) (NP (NPR King) (NPR Edward))))
(NP (D the) (N Castell) (PP (P of) (NP (NPR Four) (NPRS Stonys))))
(NP (D the) (N cherch) (PP (P of) (NP (NPR Chestir))))
(NP (D the) (N cite`) (PP (P of) (NP (NPR Camelot))))
(NP (D the) (N covent) (PP (P of) (NP (NPR Coventre))))
(NP (D the) (N tropic) (PP (P of) (NP (NPR Cancer))))
Any nouns within the PP are tagged NPR only if they are proper nouns on their own.
(NP (D the) (N feste) (PP (P of) (NP (NPR Pentecoste))))    <-- name of day
(NP (D the) (N feste) (PP (P of) (NP (NPR Ascension))))     <-- named event
(NP (D the) (N tropic) (PP (P of) (NP (NPR Cancer))))       <-- unique entity
Any nouns within the PP that are not proper nouns on their own are tagged with their ordinary tags, not with NPR.
(NP (D +te) (N feste) (PP (P of) (NP (D +te) (N camel))))
(NP (D +te) (N day) (PP (P of) (NP (N doom))))
Note the counterintuitive result in cases where the nouns in a THE N OF NP construction are common nouns, but the phrase as a whole refers to a unique entity (THE WAR OF THE ROSES). This issue awaits resolution.
In noun-noun pairs where neither of the nouns is a proper noun on its own, both parts are tagged N.
(NP (N Sir) (N Knight))
(NP (N lord) (N emperour))
(NP (N Mr.) (N Attorney))
(NP (N Mr.) (N Speaker))
Otherwise, both parts are tagged NPR.
(NP (NPR Julius) (NPR Caesar))
(NP (NPR Jhesu) (NPR Crist))
(NP (NPR Robin) (NPR Hood))
This is true even in cases where one of the nouns is not a proper noun on its own (see examples below). Such cases are exceptions to the principle of avoiding redundant use of NPR.
For cases in which one of the nouns denotes an office (KING HENRY, LORD CHIEF JUSTICE SCROPE), see Offices.
(NP (NPR Lady) (NPR Lisle))         <-- title etc. exceptionally tagged NPR
(NP (NPR mrs.) (NPR Lisle))
(NP (NPR seynt) (NPR Gregory))
(NP (NPR sire) (NPR Thomas))
(NP (D the) (NPR West) (NPRS Saxons))
(NP (NPR Penteney) (NPR Abbey))
(NP (NPR London) (NPR Brigge))
(NP (NPR London) (NPR town))
(NP (NPR North) (NPR Galys))
(NP (NPR Sussex) (NPR County))
(NP (NPR Easter) (NPR day))
(NP (NPR Lammas) (NPR term))
(NP (NPR Maundy) (NPR Thursday))    <-- MAUNDY on its own = N
(NP (NP-POS (NPR Seint) (NPR$ Edward))    <-- NPR$ by function
    (NPR day))                            <-- DAY exceptionally tagged NPR
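The noun-noun rule can be sketched in the same spirit. The lexicon passed in is hypothetical and stands in for the annotator's knowledge of which nouns are proper on their own; the function name `tag_noun_pair` is likewise invented. Plural and possessive variants (NPRS, NPR$) and the special treatment of offices are deliberately left out of this sketch.

```python
def tag_noun_pair(first, second, proper_on_own):
    """Tag a noun-noun pair: NPR on both parts if either is proper on its own."""
    if first.lower() in proper_on_own or second.lower() in proper_on_own:
        # One proper part is enough: both parts are tagged NPR.
        return [(first, "NPR"), (second, "NPR")]
    # Neither part is a proper noun on its own: both parts are tagged N.
    return [(first, "N"), (second, "N")]


# Hypothetical mini-lexicon for the examples above.
lexicon = {"lisle", "gregory", "thomas", "london", "sussex", "easter", "thursday"}

print(tag_noun_pair("Sir", "Knight", lexicon))
print(tag_noun_pair("Lady", "Lisle", lexicon))
print(tag_noun_pair("Maundy", "Thursday", lexicon))
```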
Words referring to groups of people (ethnic, ideological, or religious) are handled as follows. If the word has no plural form, it is tagged ADJ.
English_ADJ French_ADJ Jewish_ADJ Scottish_ADJ Spanish_ADJ
If the word has a plural form, it is tagged NPR(S).
Jew_NPR Jews_NPRS Spaniard_NPR Spaniards_NPRS
ENGLISHMAN, FRENCHMAN, and other compounds with -MAN are treated as written.
Many words referring to groups of people are systematically ambiguous between a nominal and an adjectival use. Our solution to this ambiguity is as follows. If the ambiguous word is overtly marked for plural or if it occurs in a syntactic context where an overtly marked plural would be possible, it is tagged NPR(S). Otherwise, the word is tagged ADJ.
He is a Catholic_NPR.        <-- analogously: Christian, German, Protestant
They are Catholics_NPRS.
They are Catholic_ADJ.
the Catholic_ADJ church_NPR
the_D Englissh_ADJ tonge_N
our native language is English_NPR
the langage of English_NPR
the_D Latin_ADJ bible_NPR
to study Latyn_NPR
(NP (D +te) (NPR Assumpcioun))
(NP (D +te) (NPR incarnacion))
(NP (D the) (NPR Passion))
(NP (D the) (NPR Resurreccion))
(NP (D the) (N King))
(NP (D the) (N Pope))
(NP (D the) (ADJ Prime) (N Minister))
(NP (N Lord) (ADJ Chief) (N Justice))
(NP (PRO$ my) (N Lord) (NP-PRN (ADJ Chief) (N Justice)))    <-- NP-PRN because of possessive
In conjunction with a proper noun (KING HENRY, LADY LISLE), nouns denoting offices are tagged NPR. These cases form a systematic exception to the principle of avoiding redundant use of NPR.
(NP (NPR kynge) (NPR Arthure))
(NP (NPR Pope) (NPR John) (NPR Paul))
The same distinction is also made in syntactically more complex cases, notably in ones where the expression denoting the office contains an adjective. When the expression occurs on its own, any adjectives are tagged ADJ (with an accompanying ADJP if postnominal).
(NP (N Attorney) (ADJP (ADJ General)))
(NP (D the) (N Lord) (ADJ Chief) (N Justice))
( (IP-MAT (NP-SBJ-1 (PRO he))
          (BED was)
          (VAN appointed)
          (IP-SMC (NP-SBJ-1 *T*)
                  (NP-OB1 (N Lord) (ADJ High) (N Admiral)))))
But when the office occurs in conjunction with a name (that is, when it functions as a title), then any adjectives are tagged NPR and the entire NP is given a flat structure.
(NP (NPR Attorney) (NPR General) (NPR Brown))
(NP (NPR Lord) (NPR Chief) (NPR Justice) (NPR Scrope))
(NP (NPR Lord) (NPR High) (NPR Admiral) (NPR Calvert))
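The contrast between an office used on its own and an office used as a title can be sketched as follows. The function name `tag_office` and its input format are hypothetical. The test for a postnominal adjective (no noun follows it within the expression) is a simplifying assumption made for this sketch, not a rule stated in the guidelines.

```python
def tag_office(words, name=None):
    """Bracket an office expression, alone or as a title before a name.

    words  list of (word, ordinary_tag) pairs, with tags N or ADJ
    name   the person's name as a string, or None if the office stands alone
    """
    if name is not None:
        # Used as a title: flat structure, every word tagged NPR.
        parts = [f"(NPR {word})" for word, _ in words]
        parts += [f"(NPR {part})" for part in name.split()]
        return "(NP " + " ".join(parts) + ")"
    parts = []
    for i, (word, tag) in enumerate(words):
        noun_follows = any(t == "N" for _, t in words[i + 1:])
        if tag == "ADJ" and not noun_follows:
            # Assumed postnominal adjective: wrap it in an ADJP.
            parts.append(f"(ADJP (ADJ {word}))")
        else:
            parts.append(f"({tag} {word})")
    return "(NP " + " ".join(parts) + ")"


justice = [("Lord", "N"), ("Chief", "ADJ"), ("Justice", "N")]
print(tag_office(justice))
print(tag_office(justice, name="Scrope"))
print(tag_office([("Attorney", "N"), ("General", "ADJ")]))
```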
Unique is taken in a strict sense.
(NP (D the) (NPR Bible))
(NP (NPR Excalibur))
(NP (NPR Scripture))    <-- SCRIPTURE counts as NPR
In general, book titles are not treated as proper nouns, as this would go against our principle of maximizing internal structure. The apparent exceptions BIBLE and SCRIPTURE are proper nouns on their own.
Nouns like the following are not necessarily proper nouns on their own, although they can be tagged NPR under the right conditions.
CITY, GRAIL, MOON, SUN, TESTAMENT, TOWER, WRIT
CHRISTENDOM in locative sense is tagged NPR. In the sense of CHRISTIAN FAITH, it is tagged N.
Hie_PRO is_BEP anginn_N of_P alle_Q cristendome_NPR ,_.
+Dre_NUM +ting_NS ben_BEP +tat_C elch_Q man_N habben_HV mot_MD ._, +te_C wile_MD his_PRO$ cristendom_N leden_VB ._.
+De_D rihte_ADJ bileue_N setten_VBP +te_D twolue_NUM apostles_NS on_P write_N ;_, ar_P hie_PRO ferden_VBD in_RP to_P al_Q middeneard_NPR to_TO bodien_VB cristendome_N ._.
CHURCH in an institutional sense is tagged NPR.
(NP (D the) (ADJ catholic) (NPR church))
(NP (D the) (NPR church) (PP (P of) (NP (NPR England))))
Names and epithets of the DEVIL (FIEND, SATAN, UNWIHT, WURSE, etc.)
are always proper.
Names and epithets of the Judeo-Christian GOD
(CREATOR, LORD, etc.) are always proper. This includes the TRINITY,
its members (FATHER, SON, HOLY GHOST), and
relevant epithets (CHRIST, HEALER, SAVIOR). LADY as epithet for Mary is
tagged NPR. In doubtful cases, the default is N. For
examples of the type OUR LORD GOD, see D/PRO$ + N + NPR.
Certain common Latin liturgical texts are treated as proper nouns.
ZODIAC and the signs of the zodiac are treated as proper nouns; GEMINI
and PISCES are treated as singular.
(NP (PRO$ Oure) (NPR Father)) (NP (PRO$ ure) (NPR helende))
(NP (PRO$ Oure) (NPR Lady))
(NP (NPR Lord)) (NP (PRO$ Oure) (NPR Lord))
(NP (D the) (NPR Trinity)) (NP (NPR +trumnesse))
(NP (NPR Lord) (NPR Iesu))
(NP (NPR Ave) (NPR Maria))
(NP (NPR Credo))
(NP (NPR Pater) (NPR Noster))
(NP (NPR Requiem))
(NP (NPR Te) (NPR Deum) (NPR Laudamus))
Pronouns (PRO, PRO$)
All pronouns are tagged PRO except pronominal MAN (also ME) and pronominal ONE.
Possessive pronouns
Possessive pronouns are tagged PRO$ whether or not they modify a noun.
Note the difference between PRO$ and WD/WPRO in this regard.
hys_PRO$ son_N
thy_PRO$ baptym_N
the_D lyon_N was_BED nat_NEG myne_PRO$
and_CONJ therefore_ADV+P ye_PRO shall_MD loose_VB youres_PRO$ !_. '_'
Reflexive pronouns
Reflexive forms (MYSELF, YOURSELF, etc.) when spelled together are
tagged PRO+N or PRO$+N. In the ambiguous case of
HERSELF, PRO is the default. SELF is always tagged N
regardless of number.
(NP (PRO$+N myself)) (NP (PRO me) (N self))
(NP (PRO+N hymself)) (NP (PRO hym) (N self))
(NP (PRO+N herself)) <-- PRO by default
(NP (PRO$+N yourselues)) (NP (PRO$ your) (N selues))
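The tagging of reflexive forms can be sketched as below. The set `POSSESSIVE_FORMS` is a hypothetical and very incomplete list of possessive spellings, and the function name `tag_reflexive` is invented; anything not in the set, including the ambiguous HER, falls back to PRO, in line with the default described above.

```python
# Hypothetical, incomplete list of possessive spellings (Middle English varies widely).
POSSESSIVE_FORMS = {"my", "thy", "our", "your"}


def tag_reflexive(pronoun, self_form, spelled_together):
    """Bracket a reflexive such as MYSELF or HYM SELF."""
    # HER is ambiguous; it is not in the set, so it gets the default PRO.
    pro_tag = "PRO$" if pronoun.lower() in POSSESSIVE_FORMS else "PRO"
    if spelled_together:
        # One word: combined tag, with SELF always contributing N.
        return f"(NP ({pro_tag}+N {pronoun}{self_form}))"
    # Two words: the pronoun and SELF are tagged separately.
    return f"(NP ({pro_tag} {pronoun}) (N {self_form}))"


print(tag_reflexive("my", "self", spelled_together=True))
print(tag_reflexive("hym", "self", spelled_together=False))
print(tag_reflexive("her", "self", spelled_together=True))
print(tag_reflexive("your", "selues", spelled_together=False))
```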