The contributor of this material included the following information about the processing she had done. The comments below in [ECI--...--ECI] indicate what has been done in converting the contributor's markup into TEI-compliant SGML:

1. Each file I'll be sending to you contains a complete edition of a certain Italian newspaper, i.e. all the texts that appeared in a certain paper on a specific date, e.g. La Stampa, 20th October 1989. La Stampa is a Turin (Italy) based newspaper with primarily regional but also national distribution. The editions constituting the material were published on the 20th and 21st October 1989. Most articles have been provided in electronic form (on floppy disk) by the publishers. Headlines and articles which had appeared in the paper version but were missing from the floppy disk have been transcribed by typing. Apart from captions, publicity, advertisements, notices, radio, television or cinema listings, share indexes etc., the electronic version of the paper corresponds to the published version.

. . .

5. Rudimentary lemmatization
   + decomposition of preposition and article, e.g. della > del+la
   * decomposition of verb and clitic, e.g. presentarsi > presentar*si
   * decomposition of cioè > cio*è and of ossia > os*sia

6. Elements of a periphrastic verb form which are separated, for example by an adverb, have been linked in the following way: è già stato detto > è [stato detto] già stato detto

7. The intrinsic differentiations of the medium 'newspaper', i.e. the textual elements, have been coded using the COCOA format:

: newspaper name, e.g. [ECI--deleted --ECI]
: the different editions of the papers -- same for every article in each file of the subcorpora, e.g. [ECI--deleted --ECI]
: the sections of the papers [ECI--converted to , n=Politica--ECI]
: the authorship type of an article:
    : signed article
    : anonymous
    : editor
: the name of the author in the case of a signed article, e.g.
[ECI--combined as Caprara Giovanni ( firmato ) --ECI]
: the position of the text on the page, e.g. : opening article on page 05 [ECI--If associated with an article, then: otherwise (a few cross-references) discarded--ECI]
: the different text forms:
   a) the different types of headings: , button-hole , title , sub-title , summary door-chain [ECI--, etc.--ECI]
: the different text forms: [ECI-- where we use either the English translation of the text form name as given below or the original if there is none--ECI]
   b) the text forms which correspond to a certain journalistic tradition, such as , LeadingArticle , [back, shoulder] , [italics, running] , Citation Letter Reply , , WeatherReport , , InShort , , Agency [ECI--also sometimes appears on p--ECI]
   c) forms which are not so easily classified: , News , [ECI--deleted - always redundant--ECI] , Criticism , Interview [ECI--Elziviro and Ricetta each also occur once--ECI]
[ECI--div3 (occasionally) and div2 (rarely) appear w/o any type where a div was required by the document structure but no definitive information could be found in the original markup as to what type it should be--ECI]

: sections of texts, denoted to represent a certain type of language: , , , [ECI--... as appropriate--ECI] , , the last two appearing only in interviews. [ECI-- ,
respectively--ECI]

9. Sometimes, i.e. not systematically, a code is used to denote words etc. in another language, e.g. for English words etc. [ECI----ECI]

10. Text in double parentheses, i.e. (( )), has been defined as 'comment', e.g. ((sic)) [ECI--unclear when these are in the original and when they've been added by E. Burr--ECI]

11. Special characters

Tag   Significance
.     full stop
}     decimal point (alt 250 under WIN WORD) [ECI--.--ECI]
,     comma
\     comma in a number (alt 179 under WIN WORD) [ECI--,--ECI]
+     decomposition of compound 'preposition and article', e.g. 'della' becomes 'del+la'
*     decomposition of compound 'verb and clitic', e.g. 'presentarsi' becomes 'presentar*si', and 'ossia' becomes 'os*sia'

font markers:
@     Roman [ECI--deleted--ECI]
^     Italic [ECI--. . .rend=italic. . .--ECI]
~     Boldface [ECI--. . .rend=bold. . .--ECI]
[ECI--attribute attached either to or or, if necessary, to an interpolated .... Where the FIRST letter (plus any preceding punctuation) of a unit was emboldened, rend=flbold was used, to avoid breaking words unnaturally.--ECI]

others:
=     hyphen [ECI--soft - deleted and closed up--ECI]
-     dash or hyphen in compounds [ECI--But is also used 1) as punctuation, rarely doubled and 2) to mark embedded quotes. These usages usually, but NOT always, are signalled by spaces preceding and/or following the hyphen--ECI]
'     apostrophe
" " or << >>   quotes in correspondence with published text.
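
For readers who want the undecomposed surface forms back, the '+' and '*' markers and the number punctuation substitutes listed under Special characters can be reversed mechanically. The following is a minimal Python sketch, not part of the original processing. The function names `recompose` and `restore_number_punctuation` are hypothetical, and the sketch assumes that a marker counts only when it sits directly between two letters (for '+' and '*') or between two digits (for '}' and '\'). That assumption is ours and is not stated in the contributor's notes.

```python
import re

# Assumption: a decomposition marker is a '+' or '*' directly between two letters.
# [^\W\d_] matches any Unicode letter, so accented characters such as 'è' are covered.
_DECOMP_MARK = re.compile(r"(?<=[^\W\d_])[+*](?=[^\W\d_])")

# Assumption: the number substitutes only count when they sit between two digits.
_DECIMAL_POINT = re.compile(r"(?<=\d)\}(?=\d)")   # '}' stands for a decimal point
_NUMBER_COMMA = re.compile(r"(?<=\d)\\(?=\d)")    # '\' stands for a comma in a number


def recompose(text: str) -> str:
    """Remove '+' and '*' decomposition markers, e.g. 'del+la' becomes 'della'."""
    return _DECOMP_MARK.sub("", text)


def restore_number_punctuation(text: str) -> str:
    """Map '}' to '.' and '\\' to ',' inside numbers."""
    text = _DECIMAL_POINT.sub(".", text)
    return _NUMBER_COMMA.sub(",", text)


if __name__ == "__main__":
    for sample in ["del+la", "presentar*si", "cio*è", "os*sia"]:
        print(sample, "->", recompose(sample))
    print(restore_number_punctuation("10}000"))
    print(restore_number_punctuation("3\\5"))
```

Restricting the substitutions to letter or digit contexts keeps ordinary uses of these characters untouched, but it may miss markers next to punctuation, so the output should be spot-checked against the published text.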