Check cond.htm for the conditions of use and credit.htm for a list of the compilers of the different corpora and the authors of the enclosed programs.
Also visit http://www.hit.uib.no/icame/cd for up-to-date information on the use of the CD-ROM. From this address it is also possible to search some of the ICAME corpora with your Web browser by using the WordSmith name and code as user ID and password.
The available manuals are enclosed on the CD-ROM.
The CD-ROM contains 20 different corpora with more than 17 million words.
The disc is in the ISO 9660 format and is therefore readable on a wide range of computer systems (DOS, Windows, Macintosh and Unix). The enclosed software is for Windows/DOS, apart from Qwick, which may be used on machines with a Java runtime system.
The disc has the following directories:
Lexa | LEXA programs, written by Raymond Hickey, Essen University
Lingfont | Linguafont programs, written by Raymond Hickey, Essen University
Manuals | The available manuals for the corpora and the software
Qwick | Qwick program developed at Birmingham University, used on the FLOB corpus
Tact | TACT program from the University of Toronto, used on the COLT corpus
Texts | The corpora in their original formats
WC | WordCruncher retrieval program and indexed versions of most of the corpora
Wsmith | WordSmith program written by Mike Scott at Liverpool University
The TEXTS directory has the following subdirectories:
ACE | Australian Corpus of English (written)
Brown1 | Brown Corpus, format 1 (written)
Brown2 | Brown Corpus, format 2 (written)
Browntag | Brown Corpus, tagged version (written)
CEECS | Corpus of Early English Correspondence Sampler (written)
COLT | Corpus of London Teenage Language (spoken)
FLOB | Freiburg-LOB Corpus of British English (written)
Frown | Freiburg-Brown Corpus of American English (written)
Helsinki | Helsinki Corpus of English Texts, Diachronic part (written)
ICE_EA | International Corpus of English, East-African component (written/spoken)
Innsbruc | Innsbruck Computer-Archive of Machine-Readable English Texts (ICAMET)
Kolhapur | Kolhapur Corpus of Indian English (written)
Lampeter | Lampeter Corpus of Early Modern English Tracts (written)
LLC | London-Lund Corpus (spoken)
LOB | Lancaster-Oslo/Bergen Corpus (written)
LOBTAG | Lancaster-Oslo/Bergen Corpus, tagged version (written)
Newdigat | Newdigate Newsletters (written)
Old_Scot | Helsinki Corpus of Older Scots (written)
POW | Polytechnic of Wales Corpus (spoken)
SEC | Lancaster/IBM Spoken English Corpus
WC | Wellington Corpus of Written New Zealand English
WSC | Wellington Corpus of Spoken New Zealand English
Some of the directories have further subdirectories.
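As a quick check that a mounted disc or a hard-disk copy contains the corpora listed above, the following minimal Python sketch compares the TEXTS directory against that list. The function name find_corpus_dirs is hypothetical and not part of the CD-ROM software. The comparison ignores case because ISO 9660 discs may show names in upper case.

```python
from pathlib import Path

# Subdirectory names as listed above for the TEXTS directory.
TEXTS_SUBDIRS = [
    "ACE", "Brown1", "Brown2", "Browntag", "CEECS", "COLT", "FLOB",
    "Frown", "Helsinki", "ICE_EA", "Innsbruc", "Kolhapur", "Lampeter",
    "LLC", "LOB", "LOBTAG", "Newdigat", "Old_Scot", "POW", "SEC",
    "WC", "WSC",
]


def find_corpus_dirs(texts_dir):
    """Map each listed name to its path under texts_dir, or None if absent.

    Hypothetical helper: names are compared without regard to case.
    """
    present = {
        p.name.upper(): p for p in Path(texts_dir).iterdir() if p.is_dir()
    }
    return {name: present.get(name.upper()) for name in TEXTS_SUBDIRS}


# Example call, assuming the CD-ROM is drive D: (an assumption):
# missing = [n for n, p in find_corpus_dirs(r"D:\TEXTS").items() if p is None]
```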
WordSmith is installed to the hard disk by running Setup.Exe in the WSmith directory.
The program is installed in the directory C:\WSMITH by default; this name may be changed by the user.
After the program has been installed on the hard disk, start WSHELL.EXE to use the program. Update the demo version to the full version by choosing Adjust Settings and Update from Demo.
When "Updating from Demo", please type in the details EXACTLY as you see them inside the CD-ROM cover. The first (8 letters or numbers) is the "Name", the longer one is the "Registration". You can put any other information in "Other Details" if you wish. Please see the "readme.txt" file for any further details.
WordSmith can be used with the texts in the TEXTS directory. It may also be used with texts in the WC directory (files with extension .BYB).
The WordSmith manual is found in the WSMITH directory as a Word document (manual.doc) or as an Acrobat file in the MANUALS directory (wsmanual.pdf). If you do not have this version of Acrobat, you can install version 4.0 for Win95/98/NT by running the file AR40ENG.EXE in the MANUALS directory on the CD-ROM. Other versions of Acrobat are available from Adobe.
WordSmith is commercial software and you are only permitted to run the software on the number of computers you have bought licences for.
For support on using the WordSmith program with ICAME texts, contact Knut.Hofland@hit.uib.no.
The retrieval component of WordCruncher (for DOS) is installed by opening an MS-DOS prompt/window from the Start menu. Select the letter for the CD-ROM (usually D:) and type COPY_WC. The program files are then copied to C:\WCS and the program is started in the "menu mode". To exit the program, press SHIFT+F10. The next time you want to start the ICAME CD, click WCVLTD.EXE in the C:\WCS directory (or make a shortcut to this file from your desktop).
WordCruncher can also be run in "bookshelf mode". Change to this mode by running NOMENUCD.BAT. To go back to "menu mode", run MENUCD.BAT. Some of the options of WordCruncher are not available in the "menu mode" (the CONCORD option from the Main Menu and the Frequency Distribution from the reference list). Use "bookshelf mode" to access these.
With this installation, all the corpora are used from the CD-ROM. For improved speed when working with the texts, copy the relevant corpora (all the files in a directory) to your hard disk or network disk. In "bookshelf mode", press "Insert" to put these corpora on the bookshelf.
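Copying a corpus directory to the hard disk can be done in Windows Explorer, or scripted. The following is a minimal Python sketch that copies one directory with all its files. The function name copy_corpus_dir and the paths in the example call are assumptions for illustration, not names taken from the CD-ROM.

```python
import shutil
from pathlib import Path


def copy_corpus_dir(corpus_dir, dest_root):
    """Copy one corpus directory, with all its files, below dest_root.

    Hypothetical helper. Raises FileExistsError if the target directory
    already exists, so an earlier copy is never overwritten silently.
    """
    src = Path(corpus_dir)
    dst = Path(dest_root) / src.name
    shutil.copytree(src, dst)
    return dst


# Example call; both paths are assumptions, adjust them to your system:
# copy_corpus_dir(r"D:\WC\LOB", r"C:\CORPORA")
```

Files copied from a CD-ROM may keep their read-only attribute on the hard disk.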
A scanned version of the "Learning WCView" manual is included on the CD-ROM. In "menu mode" there is a short introduction to the features of the WCView program.
You are allowed to install WordCruncher on an unlimited number of computers.
Install TACT to your hard disk by running INSTALL.EXE from the TACT directory. After TACT has been installed, copy the Colt textbases (COLT*.TDB) from TACT\TACT214 to the place where TACT was installed (usually C:\TACT214).
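The copy step for the COLT textbases can also be scripted. The following minimal Python sketch copies every file matching COLT*.TDB from one directory to another. The function name copy_colt_textbases is hypothetical, and the drive letter in the example call is an assumption. File names are matched without regard to case.

```python
import shutil
from pathlib import Path


def copy_colt_textbases(source_dir, tact_dir):
    """Copy all COLT*.TDB files from source_dir into tact_dir.

    Hypothetical helper. Returns the names of the copied files.
    """
    copied = []
    for path in sorted(Path(source_dir).iterdir()):
        name = path.name.upper()
        if path.is_file() and name.startswith("COLT") and name.endswith(".TDB"):
            shutil.copy2(path, Path(tact_dir) / path.name)
            copied.append(path.name)
    return copied


# Example call, assuming the CD-ROM is drive D: and TACT is in C:\TACT214:
# copy_colt_textbases(r"D:\TACT\TACT214", r"C:\TACT214")
```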
To work with the COLT files, start USEBASE.EXE from the directory C:\TACT214 and press return to choose a suitable database name:
COLTORT1 orthographic version, everything is indexed
COLTORT2 orthographic version, text in <> is not indexed
COLT_TAG tagged version, tag is connected to word with underscore (word_tag)
COLTPRO1 prosodic version (only 150 of 377 files), everything is indexed
Press the Spacebar to access the menu line, select a display (from the Displays menu) and then search from the Select menu. F1 gives (context sensitive) help and F10 exits the program.
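If tagged text in the word_tag form described for COLT_TAG is processed outside TACT, each token can be split at its last underscore. The following is a minimal Python sketch of that idea. The function name and the tag values in the comments are hypothetical and are not taken from the COLT tagset.

```python
def split_tagged_token(token):
    """Split a word_tag token at its last underscore.

    Hypothetical helper. Returns (word, tag), or (token, None) when the
    token contains no underscore.
    """
    word, sep, tag = token.rpartition("_")
    if not sep:
        return token, None
    return word, tag


# The tags below are made-up placeholders, not real COLT tags:
# split_tagged_token("going_TAG1")  gives ("going", "TAG1")
# split_tagged_token("yeah")        gives ("yeah", None)
```

Splitting at the last underscore rather than the first keeps any underscore inside the word itself intact.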
To install Qwick with the pre-indexed FLOB corpus, you first have to install the Java runtime for the platform you are working on. For Windows 95/98/NT run Java_win.exe in the QWICK directory on the CD-ROM. Please make a note of where this program is installed. Then unzip the file Qwick.zip to the directory C:\QWICK. If you do not have WinZip, use the demo copy included and install this from WinZip70.exe.
If you are running a non-English version of Windows, you have to edit the file QWICK.BAT in C:\QWICK (see comments in this file).
Start Qwick by running QWICK.BAT.
For documentation, click the file index.html in the qwick-1.0\doc directory or visit the Qwick site at the University of Birmingham.
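As an alternative to WinZip, the unzip step can be done with any tool that reads ZIP archives. The following minimal Python sketch uses the standard zipfile module. The function name unzip_archive is hypothetical, and the paths in the example call assume the CD-ROM is drive D:.

```python
import zipfile
from pathlib import Path


def unzip_archive(zip_path, dest_dir):
    """Extract every member of a ZIP archive into dest_dir.

    Hypothetical helper. Creates dest_dir if needed and returns the
    list of member names in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call; the drive letter is an assumption:
# unzip_archive(r"D:\QWICK\Qwick.zip", r"C:\QWICK")
```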
To copy the Lexa programs to your hard disk (C:\LEXA), run the file LEXA\COPYLEXA.BAT or use Windows Explorer to copy the directory LEXA with its subdirectories. For documentation, see the RTF files in the LEXA\DOCUMENT directory.
Use Windows Explorer to copy the directory LINGFONT to your hard disk.
Use LTEXT from the LEXA programs to view the documentation in the LINGFONT\DOCUMENT directory.
Questions about the ICAME CD-ROM and the software can be directed to:
Knut Hofland
HIT Centre
University of Bergen
Allegt. 27
N-5007 Bergen
Norway
Tel. +47 5558 9463
Fax. +47 5558 9470
E-mail: Knut.Hofland@hit.uib.no