8 KWIC concordance

There is a complete KWIC concordance based on the tagged LOB Corpus. The concordance was available on tape and microfiche (Hofland and Johansson 1986), but is no longer available.

8.1 Tapes and files

Code:             ASCII
Tracks:           9
Density:          1600 or 6250
Label:            none
Parity:           odd
Files:            95 (see the index in 8.6)
EOF marks:        1 after each file, 2 at end of tape
Record size:      132
Blocking factor:  60
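
The record size and blocking factor above imply a block of 132 x 60 = 7920 characters. As a rough illustration only, the following Python sketch steps through such blocks. It assumes one tape file has been copied to disk as a flat byte image with no line terminators between records; the function name `iter_records` is hypothetical and not part of any distributed software.

```python
import io

RECORD_SIZE = 132      # characters per record (section 8.1)
BLOCKING_FACTOR = 60   # records per block (section 8.1)
BLOCK_SIZE = RECORD_SIZE * BLOCKING_FACTOR  # 7920 characters per full block


def iter_records(stream):
    """Yield 132-character records from a binary stream.

    Assumption: the stream is a flat image of one tape file, with
    records stored back to back and no newline separators.
    """
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            break
        # The last block of a file may be short; yield whole records only.
        for start in range(0, len(block) - RECORD_SIZE + 1, RECORD_SIZE):
            record = block[start:start + RECORD_SIZE]
            yield record.decode("ascii", errors="replace")


# Usage with an in-memory stand-in for a tape file image:
demo = io.BytesIO(b"A" * RECORD_SIZE + b"B" * RECORD_SIZE)
records = list(iter_records(demo))
```

If the tape was copied with a utility that inserts line terminators, reading line by line would be the simpler choice.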

8.2 Records

Column    Contents
1-7       Reference number (see 2.5)
8         Blank
9-61      Lefthand context
62-132    Keyword + righthand context
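
The column layout above maps directly onto string slices. The sketch below is a minimal illustration, not a reference implementation: the names `KwicRecord` and `parse_record` are hypothetical, and the sample reference number is invented for the demonstration (see 2.5 for the actual format).

```python
from typing import NamedTuple


class KwicRecord(NamedTuple):
    reference: str          # columns 1-7
    left_context: str       # columns 9-61
    keyword_and_right: str  # columns 62-132


def parse_record(line: str) -> KwicRecord:
    """Split one 132-column concordance record into its fields."""
    # Pad short lines so that slicing never runs past the end.
    line = line.rstrip("\r\n").ljust(132)
    return KwicRecord(
        reference=line[0:7].strip(),      # columns 1-7
        left_context=line[8:61],          # columns 9-61 (column 8 is blank)
        keyword_and_right=line[61:132],   # columns 62-132
    )


# Usage with an invented record (the reference number is a placeholder):
sample = "A01  12" + " " + "on the left".rjust(53) + "keyword and the rest".ljust(71)
rec = parse_record(sample)
```

Because the key word always starts in column 62, the key words of successive records line up vertically when the records are printed unchanged.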

8.3 Sorting

In the original LOB Corpus concordance (Hofland and Johansson 1979) the arrangement is by graphic word, and the sorting under each key word is alphabetical according to the righthand context. In the new concordance the occurrences under each graphic word are sorted in the first instance by tag and, under each tag, according to the context: tag of key word +1, spelling of key word +1, tag of key word +2, spelling of key word +2. From key word +3 onwards the sorting is by spelling only. The sorting sequence among the characters follows ascending ASCII order (!"#$%&'()*+,-./0-9:;<=>?@A-Z[\]^_`{|}), with no separation between upper and lower case. Punctuation marks and other delimiters such as quotes and parentheses are skipped.
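
The ordering within one graphic word can be pictured as a composite sort key. The following Python sketch is one possible reading of the rules above, under stated assumptions: each occurrence is given as the tag of the key word plus a list of (spelling, tag) pairs for the righthand context, "skipped" is taken to mean that tokens containing no letter or digit are dropped, and case is neutralised by folding to upper case. The function name `sort_key` and the sample data are hypothetical.

```python
def sort_key(keyword_tag, right_context):
    """Build a sort key for one occurrence of a graphic word.

    right_context is a list of (spelling, tag) pairs following the key word.
    """
    # Assumption: punctuation and other delimiters are tokens without any
    # letter or digit; they are skipped. Case is folded to upper case.
    words = [(w.upper(), t) for w, t in right_context
             if any(c.isalnum() for c in w)]
    key = [keyword_tag]
    # Key word +1 and +2: tag first, then spelling.
    for spelling, tag in words[:2]:
        key.extend([tag, spelling])
    # From key word +3 onwards: spelling only.
    key.extend(spelling for spelling, _ in words[2:])
    return tuple(key)


# Three invented occurrences of one graphic word:
occurrences = [
    ("NN", [("of", "IN"), ("drinks", "NNS")]),
    ("JJ", [("table", "NN")]),
    ("NN", [(",", ","), ("and", "CC")]),
]
ordered = sorted(occurrences, key=lambda occ: sort_key(*occ))
```

In this example the JJ occurrence comes first, and the two NN occurrences are then ordered by the tag of the following word (CC before IN), the comma having been skipped.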

The concordance starts with the letter A. Graphic words with a non-alphabetical initial (including enclitics like 'll and 's) are placed behind Z, at the end of the concordance. Forms marked with the abbreviation code (\0) are placed at the end of the entry for the relevant graphic word.
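
The placement of graphic words with a non-alphabetical initial can be sketched as a two-part key: a flag that pushes such words behind Z, followed by the case-folded spelling. This is an illustrative assumption about how the rule could be implemented; `graphic_word_key` is a hypothetical name, and the treatment of forms marked with the abbreviation code is not modelled.

```python
def graphic_word_key(word):
    """Sort key that places words with a non-alphabetical initial after Z."""
    first = word[:1]
    alphabetical = first.isascii() and first.isalpha()
    # False sorts before True, so alphabetical initials come first.
    return (not alphabetical, word.upper())


words = ["'ll", "zoo", "193,174", "A", "and", "#"]
ordered_words = sorted(words, key=graphic_word_key)
```

With this key the alphabetical words come first, and the remaining forms follow in ascending ASCII order, which agrees with the last two entries of the index in 8.6 (# before 193,174).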

8.4 Example (obsolete)

8.5 Frequencies

At the beginning of each graphic word there is a frequency survey giving the following information: (1) total frequency of each tag found with the word, (2) relative proportion of each tag (expressed in %), (3) relative frequency of each tag (expressed in words per million), and (4) absolute and relative frequencies of each tag in the individual text categories. See the example above.
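
Items (1) to (3) of the survey are simple ratios. The sketch below shows the arithmetic under stated assumptions: the counts are invented, `frequency_survey` is a hypothetical name, and the corpus size is passed in as a parameter (the round figure of one million words used in the example is a placeholder, not the exact size of the tagged corpus). The breakdown by text category in item (4) would repeat the same calculation per category.

```python
def frequency_survey(tag_counts, corpus_size):
    """Return count, percentage and per-million rate for each tag of a word.

    tag_counts maps each tag to the number of occurrences of the word
    with that tag; corpus_size is the total number of words in the corpus.
    """
    total = sum(tag_counts.values())
    survey = {}
    for tag, count in tag_counts.items():
        survey[tag] = {
            "count": count,                               # item (1)
            "percent": 100.0 * count / total,             # item (2)
            "per_million": 1e6 * count / corpus_size,     # item (3)
        }
    return survey


# Invented counts for one graphic word; corpus size is a placeholder.
survey = frequency_survey({"NN": 30, "VB": 10}, corpus_size=1_000_000)
```

Here the NN reading accounts for 75 % of the occurrences and occurs 30 times per million words.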

8.6 Index to the KWIC concordance (tapes)

File no.  First word in file
1         Introduction
2         a_&FW
3         a'_ABN
4         advance_JJB
5         all-age_JJB
6         an'_CC
7         and/or_CC
8         area_NN
9         Asaph_NP
10        attitudinising_NN
11        B_ZZ
12        Be*?2gue*?2_NP
13        believing_VBG
14        Borodin's_NP$
15        Butagas_NP
16        Canaanites_NNPS
17        character-actors_NNS
18        coalition_NN
19        conducting_NN
20        Coulomb_NP
21        d_ZZ
22        departs_VBZ
23        discusses_VBZ
24        draped_JJ
25        einen_&FW
26        estate-bottled_JJ
27        Expresso_NP
28        Feltri_NP
29        fluttering_NN
30        forage_NN
31        from-abroad_JJB
32        g_ZZ
33        Go"sta_NP
34        Guinevere_NP
35        hard-core_JJB
36        he's_NN$
37        high-backed_JJ
38        histamine_NN
39        I_&FW
40        Ifield_NP
41        in*:2**:_NNU
42        interviewed_VBD
43        Isa_NP
44        it's_NN$
45        killer_NN
46        L_NC
47        leggings_NNS
48        living-room_NN
49        Madrassi_NP
50        Matthew_NP
51        microbe_NN
52        more-than-life-size_JJB
53        mycological_JJ
54        news-hounds_NNS
55        not-necessary_JJB
56        O_&FW
57        off_IN
58        on-the-spot_JJB
59        ora_&FW
60        overestimate_VB
61        pay-off_NN
62        placid_JJ
63        precipice_NN
64        prosperous_JJ
65        R_ZZ
66        referee_NN
67        resurgent_JJ
68        rustled_VBD
69        scientifically_RB
70        set-back_NN
71        shoulda_MD
72        smallest_JJT
73        souls_NNS
74        steel-and-glass_JJB
75        such-and-such_ABL
76        t_PP3
77        Thangue_NP
78        thatch_NN
79        the*?2a*?5tre_&FW
80        thereabouts_RB
81        Thistle_NP
82        tiara_NN
83        TO"lz_NP
84        triumphal_JJ
85        U_ZZ
86        US-initiated_JJ
87        voting_NN
88        Wasdale_NP
89        werewolves_NNS
90        whichever_WDT
91        wing_NN
92        women's_NNS$
93        years'_NNS$
94        #_&FO
95        193,174_CD