CONJ | coordinating conjunction |
D | determiner |
FP | focus particle |
FW | foreign word |
NEG | negation |
NUM | cardinal number (except ONE) |
RP | adverbial particle |
XX | unknown POS |
ABOUT, ACROSS, BY, DOWN, IN, OFF, ON, OUT, OVER, THROUGH, TO, UP
and_CONJ rode_VBD on_RP more_QR than_P a_D paas_N
sir_NPR Ector_NPR assayed_VBD to_TO pulle_VB oute_RP the_D swerd_N
And_CONJ Sir_NPR Rauf_NPR of_P Beeston_NPR +gaf_VBD vp_RP the_D castel_N
to_P the_D Kyng_N
and_CONJ therwith_ADV+P he_PRO yelde_VBD up_RP the_D ghost_N
Items from the above list that immediately precede a preposition continue to be tagged as particles as long as they are spelled as separate words (notably IN TO and UP ON, but not INTO, UPON or ADOWN, APON, UNTO).
me_MAN droh_VBD hire_PRO +tus_ADV in_RP to_P dorkest_ADJS wan_N
knelyng_VAG doun_RP oppon_P his_PRO$ knees_NS
oute_RP of_P the_D byggest_ADJS castell_N
doune_RP to_P the_D erthe_N
+tai_PRO were_BED exilede_VAN oute_RP of_P Spaygne_NPR
&_CONJ earnin_VB him_PRO crune_N up_RP o_P crune_N
&_CONJ healden_VB hit_PRO se_ADVR wal_ADV hat_ADJ hehe_ADJ up_RP on_P
hire_PRO$ heaued_N
When an item from the above list combines with -WARD, it continues to be tagged RP.
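The word_TAG notation used in the examples above can be read mechanically. The following is a minimal sketch, not part of the corpus tooling: the function names are hypothetical, and it assumes that tokens are separated by whitespace and that the tag follows the last underscore in each token. It finds the pattern described above, a particle (RP) written as a separate word immediately before a preposition (P).

```python
def split_token(token):
    """Split 'word_TAG' at the last underscore into (word, tag).

    Assumption: every token contains an underscore; a token without one
    comes back with an empty word and the whole token as its tag.
    """
    word, _, tag = token.rpartition("_")
    return word, tag


def parse_tagged(line):
    """Turn a whitespace-separated line of word_TAG tokens into pairs."""
    return [split_token(token) for token in line.split()]


def particle_preposition_pairs(tagged):
    """Return (particle, preposition) pairs where RP directly precedes P."""
    return [
        (word1, word2)
        for (word1, tag1), (word2, tag2) in zip(tagged, tagged[1:])
        if tag1 == "RP" and tag2 == "P"
    ]


example = "knelyng_VAG doun_RP oppon_P his_PRO$ knees_NS"
tagged = parse_tagged(example)
pairs = particle_preposition_pairs(tagged)
```

Splitting at the last underscore rather than the first keeps tags such as PRO$ and ADV+P intact and also handles punctuation tokens such as `._,`.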
All cardinal numbers except ONE are tagged NUM, whether spelled out,
in numeral form, or in some combination of the two.
Compound numbers are treated as written.
For AND in number sequences, see AND.
For numbers in foreign language sequences, see foreign words.
DOUBLE, TRIPLE, etc., TWICE and THRICE, and ONCE when analogous in
meaning to TWICE and THRICE, are always tagged NUM.
DOZEN and SCORE are treated as cardinal numbers, on a par with
HUNDRED and THOUSAND, and tagged NUM.
When overtly marked for plural (DOZENS, SCORES, HUNDREDS, THOUSANDS,
etc.), cardinal numbers are tagged NS.
Where an ordinal might be expected but there is no overt ordinal
marking, the number is treated as a cardinal and tagged NUM.
Otherwise, ordinal numbers are tagged ADJ.
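The number rules above amount to a short decision sequence, which can be sketched as follows. This is only an illustration: the lookup tables are hypothetical, deliberately tiny, and keyed on normalized modern headwords, whereas the corpus contains many spelling variants and numeral forms that this sketch does not handle. ONCE is left out because its tag depends on its meaning, and ONE is left out because it is excluded from NUM.

```python
# Hypothetical lookup tables over normalized, upper-case headwords.
CARDINALS = {"TWO", "THREE", "FIFTEEN", "DOZEN", "SCORE", "HUNDRED", "THOUSAND"}
MULTIPLICATIVES = {"DOUBLE", "TRIPLE", "TWICE", "THRICE"}
PLURAL_MARKED = {"DOZENS", "SCORES", "HUNDREDS", "THOUSANDS"}
ORDINALS = {"THIRD", "FOURTH", "NINTH"}  # overtly marked ordinals only


def number_tag(headword):
    """Return the tag suggested by the rules above, or None if none applies."""
    h = headword.upper()
    if h in PLURAL_MARKED:
        return "NS"   # overt plural marking
    if h in ORDINALS:
        return "ADJ"  # overt ordinal marking
    if h in CARDINALS or h in MULTIPLICATIVES:
        return "NUM"  # cardinals, including DOZEN and SCORE; DOUBLE, TWICE, etc.
    return None       # ONE and anything not covered by this sketch
```

For example, `number_tag("dozen")` gives NUM while `number_tag("dozens")` gives NS, mirroring the distinction drawn above.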
Demonstratives are always tagged D, regardless of whether
they precede a noun. Note the difference between ordinary determiners
and wh- words in this regard.
For cases ambiguous between AN (D) and ONE (ONE),
the default is D.
Foreign names and certain common Latin liturgical
texts are treated as proper nouns.
Otherwise, everything (words, symbols, numbers, etc.) except
punctuation in foreign language sequences is labelled FW.
Foreign language titles are generally tagged FW.
MARY, MARRY and spelling variants are tagged NPR at the word level,
but surrounded by INTJP brackets at the clausal level.
See Interjection phrase (INTJP).
NO is tagged INTJ only when parallel to YES.
Items like FORSOOTH or wh- words are not tagged at the POS level as
INTJ, even when used as interjections. Their function is sometimes
indicated at the phrasal level; see Interjection phrase (INTJP).
When negation cliticizes to the beginning of verbs and modals, the
resulting combination is treated as a compound.
When negation cliticizes to the end of modals, the resulting
combination is split.
Cardinal numbers except ONE (NUM)
This_D Joyous_ADJ trouth_N conteyneth_VBP in_P itself_PRO two_NUM
partyes_NS
+ter_EX schal_MD com_VB befor_ADV xv_NUM dayes_NS of_P gret_ADJ drede_N
and_CONJ all_D men_NS and_CONJ woymen_NS and_CONJ childyrne_NS schull_MD
aryse_VB vp_RP yn_P +te_D age_N of_P xxx=ti=_NUM +gere_N
double_NUM manere_N of_P money_N
+te_D sale_N of_P +tynges_NS was_BED of_P double_NUM price_N
and_CONJ twyse_NUM I_PRO smote_VBD hym_PRO downe_RP
his_PRO$ horse_N turned_VBD twyse_NUM abowte_RP
The_D ij._NUM day_N
the_D .ix._NUM chapytre_N
Complementizers (C)
THAT, +TE, and variants introducing any kind of subordinate clause are
tagged C.
See also
AS (complementizer),
AS, SO, THAN (preposition),
IF.
and_CONJ sei+t_VBP +tat_C it_PRO was_BED ano+ter_D+OTHER body_N
and_CONJ was_BED i-schore_VAN monk_N in_P an_D abbay_N
+tat_C he_PRO hym_PRO self_N bulde_VBD
dohter_N he_PRO cleope+d_VBP hire_PRO ._, for-+ti_P+D
+tt_C ha_PRO understonde_VBP ._, +tt_C he_PRO hire_PRO
luueliche_ADJ liues_N$ luue_N leare+d_VBP ._, as_P feader_N
ah_MD his_PRO$ dohter_N ._.
Al_Q swo_ADV he_PRO de+d_DOP +to_D men_NS +de_C sennen_NS
habbe+d_HVP forhaten_VBN te_TO laten_VB
Coordinating conjunctions (CONJ)
The following items are tagged CONJ when used as conjunctions.
In discontinuous conjunctions, both parts are tagged CONJ.
AND, NE, NOR, OR
ai+der_CONJ +ge_CONJ hodede_VAN +ge_CONJ leawede_ADJ
+ge_CONJ +de_D saule_N +ge_CONJ +de_D lichame_N
nai+der_CONJ ne_CONJ be_P heuene_NPR ne_CONJ be_P ier+de_NPR
Determiners (D)
The following words are tagged D when used as determiners:
A, AN, THAT, THE, THESE, THIS, THOSE, YON, YONDER
and_CONJ cristened_VBD hym_PRO at_P +te_D citee_N Dortik_NPR ,_, +tat_D
is_BEP Dorchestre_NPR
+tat_D is_BEP Friday_NPR
+Tis_D +tat_C is_BEP i-seide_VAN in_P +te_D comyn_ADJ table_N
For_FOR to_TO brynge_VB +tis_D aboute_RP Machometus_NPR norsched_VBD
and_CONJ fedde_VBD a_D faire_ADJ camel_N
Focus particles (FP)
The following words, all of which also have other uses, are tagged
FP when used as focus particles:
ALONE, BUT, EVEN, FORTH, ONE, ONLY, YET
Singuler_ADJ lufe_N es_BEP bot_FP of_P Jhesu_NPR Cryste_NPR alane_FP
+tan_ADV wil_MD +te_PRO liste_VB stele_VB by_P +te_PRO alane_FP
there_EX was_BED but_FP fewe_Q folk_NS at_P that_D tyme_N that_C
beleved_VBD perfitely_ADV
ye_PRO were_BED never_ADV but_FP my_PRO$ servaunte_N syn_P ye_PRO
resseyved_VBD the_D omayge_N of_P oure_PRO$ Lorde_N Jesu_NPR Cryste_NPR
they_PRO shold_MD come_VB by_P Crystmasse_NPR even_FP unto_P London_NPR
and_CONJ kut_VBD thorow_P the_D trappoure_N of_P stele_N and_CONJ the_D
horse_N evyn_FP in_P two_NUM pecis_NS
+tat_C hie_PRO ne_NEG biholden_VBP non_Q iuel_N ne_CONJ non_Q
un-nut_ADJ ne_CONJ for+den_FP idel_ADJ
hie_PRO bie+d_BEP ut-iworpen_VAN +durh_P dieules_NPR$ lare_N ,_,
naht_NEG for_P hem_PRO seluen_N ane_FP
+De_D mann_N ne_NEG leue+d_VBP naht_NEG $be_P {TEXT:he}_CODE
bread_N ane_FP
he_PRO axede_VBD no+ting_Q+N wi+t_P here_PRO ,_, but_P oneliche_FP
heir_PRO$ clo+ting_N and_CONJ oneliche_FP heir_PRO$ body_N
hwi_WADV wi+d_VBP21 dra+gest_VBP22 +tu_PRO +tin_PRO$ hont_N ._,
&_CONJ +get_FP +tin_PRO$ king_N hond_N of_P midde_N +tine_PRO$
bosme_N
Foreign words (FW)
libro_FW 5=o=_FW ,_, capitulo_FW 24=o=_FW ._.
In_P that_D Alcoranum_FW it_PRO is_BEP i-wrete_VAN
in_P the_D prologe_N on_P Regum_FW
iij._NUM stories_NS of_P the_D ij._NUM book_N of_P Paralypomynon_NPR
and_CONJ of_P Regum_FW
Interjections (INTJ)
INTJ is only used to tag words that are difficult or impossible
to tag any other way, like the following:
AH, ALAS, AMEN, AYE, FAREWELL, FIE,
GAR (< God), GRAMERCY, HA, LA, LO, NAY, NO, OH, PARDEE, POOF, WASSAIL, WELAWEI, YEA,
YES, WITECRIST
But_CONJ +te_D chanouns_NS of_P Dorchestre_NPR sei+t_VBP nay_INTJ
"_" Nay_INTJ ,_, "_" quo+t_VBD +te_D aungel_N ,_.
+Te_D kyng_N and_CONJ his_PRO$ fautoures_NS seide_VBD "_" +gis_INTJ al_Q
at_P +te_D fulle_ADJ ._. "_"
Loo_INTJ ,_, "_" quod_VBD Mahometus_NPR
Oo_INTJ Kyng_N of_P bliss_N
Negation (NEG)
The negative particles NE and NOT are tagged NEG, as are NO and NONE
in WHETHER OR NOT clauses. NE is also used as a coordinating
conjunction (CONJ), and NOT is also used as a quantifier (Q).
non_Q senne_N ne_NEG mai_MD bien_BE idon_DAN bute_P +durh_P
unhersumnesse_N
for_CONJ I_PRO wille_MD not_NEG be_BE long_ADJ behynde_ADV
hit_PRO ne_NEG derue+d_VBP ham_PRO nawt_NEG
me_MAN ne_NEG net_VBP me_PRO noht_NEG te_TO forsweri+gen_VB
wheither_WQ it_PRO oghte_MD nedes_ADV be_BE doon_DAN or_CONJ noon_NEG
wheither_WQ he_PRO wol_MD doon_DO or_CONJ no_NEG
Unknown POS (XX)
Words with unknown POS are tagged XX.
(NP-OB1 (NUM C) (N myle)
(XX li))
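Bracketed examples like the one above can be read with a small recursive parser. The following is a minimal sketch under the assumption that every node has the shape `(LABEL child ...)` and that leaves have the shape `(TAG word)`. The function names are hypothetical and not part of any corpus software.

```python
def parse_brackets(text):
    """Parse a labelled bracketing into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        items = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return items

    return node()


def leaves(tree):
    """Yield (tag, word) for every node of the form [tag, word]."""
    if len(tree) == 2 and isinstance(tree[1], str):
        yield tree[0], tree[1]
    else:
        for child in tree[1:]:
            if isinstance(child, list):
                yield from leaves(child)


tree = parse_brackets("(NP-OB1 (NUM C) (N myle) (XX li))")
unknown = [word for tag, word in leaves(tree) if tag == "XX"]
```

Collecting the XX leaves in this way gives a quick list of words whose part of speech was left undetermined.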