The contributor of this material included the following information
about processing she had done.  The comments below in [ECI--...--ECI]
indicate what has been done in converting the contributor's markup
into TEI-compliant SGML:

1. Each file I'll be sending to you contains a complete edition of an
Italian newspaper, i.e. all the texts that appeared in that paper on
a specific date, e.g. La Stampa 20th October 1989.

La Stampa is a Turin (Italy) based newspaper with primarily regional
but also national distribution.  The editions constituting the
material were published on the 20th and 21st October 1989.  Most
articles were provided in electronic form (on floppy disk) by the
publishers.  Headlines and articles which had appeared in the paper
version but were missing from the floppy disk have been typed in
by hand.  Apart from captions, publicity, advertisements, notices,
radio, television or cinema listings, share indexes etc., the
electronic version of the paper corresponds to the published version.

. . .

5. Rudimentary Lemmatization

+ 	decomposition of preposition and article, e.g. della > del+la
* 	decomposition of verb and clitic, e.g. presentarsi > presentar*si
* 	decomposition of cioè > cio*è and of ossia > os*sia
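
As an illustration of the decomposition markers above, here is a minimal
Python sketch that splits a marked token into its parts and rejoins it
into the surface form.  The function names are hypothetical, and the
sketch assumes that '+' and '*' occur in the text only as decomposition
markers, which the notes do not guarantee.

```python
# Markers as described above: '+' splits preposition and article,
# '*' splits verb and clitic (and the fixed forms cio*è, os*sia).
DECOMPOSITION_MARKERS = "+*"


def split_decomposed(token):
    """Return the parts of a marked token, e.g. 'del+la' -> ['del', 'la']."""
    parts = [token]
    for marker in DECOMPOSITION_MARKERS:
        parts = [piece for part in parts for piece in part.split(marker)]
    return parts


def recompose(token):
    """Rejoin a marked token into its undecomposed surface form."""
    return "".join(split_decomposed(token))
```

For example, recompose("presentar*si") gives back "presentarsi", and a
token without markers is returned unchanged.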

6. elements of a periphrastic verb form which are separated, for
example by an adverb, have been linked in the following way:

	è già stato detto > è [stato detto] già stato detto
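
A minimal Python sketch of how the linked forms might be read back out
of the text.  It assumes, on the basis of the single example above, that
the bracketed group directly follows the first element of the verb form;
the pattern and the function name are hypothetical.

```python
import re

# Assumed layout: first verb element, then the linked elements in [ ].
LINKED_FORM = re.compile(r"(\w+)\s+\[([^\]]+)\]")


def linked_verb_forms(text):
    """Return each periphrastic form as one contiguous string."""
    return [f"{first} {rest}" for first, rest in LINKED_FORM.findall(text)]
```

Applied to the example line, this yields the single form "è stato detto".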

7. the intrinsic differentiations of the medium 'newspaper', i.e. the
textual elements, have been coded using the COCOA format:

<Z>: newspaper name, e.g. <Z Stampa>			[ECI--deleted --ECI]

<E>: the different editions of the papers -- same for every article
     in each file of the subcorpora e.g. <E 191089>	[ECI--deleted --ECI]

<S>: the sections of the papers
     e.g. <S Politica>, <S Cultura>
	[ECI--converted to <div1 type=section n=Politica>--ECI]

<A>: the authorship type of an article:		
	<A firmato>: signed article
	<A non firmato>: anonymous
	<A Redazione>: editor
<N>: the name of the author in the case of a signed article
     e.g. <N Caprara Giovanni>
	[ECI--combined as
	<docAuthor>
	Caprara Giovanni
	( firmato )
	</docAuthor>--ECI]

<C>: the position of the text on the page
     e.g. <C AP05>: opening article on page 05
    [ECI--If associated with an article, then:
	  <div2 type=article rend="pageNo:AP05">
          otherwise (a few cross-references) discarded--ECI]
<T>: the different text forms:
  a) the different types of headings:
     	<T Occhiello>, 		button-hole
	<T Titolo>, 		title
	<T Sottotitolo>,	sub-title 
	<T Sommario>, 		summary
	<T Catenaccio>		door-chain
  [ECI--<head type=Occhiello>, etc.--ECI]

<T>: the different text forms:	[ECI--<div3 type=textform>
				 where we use either the English translation
				 of the text form name as
				 given below or the original if there is
				 none--ECI]

  b) the text forms which correspond to a certain journalistic
     tradition such as 
	<T Fondo>, 		LeadingArticle
	<T Spalla>, 		[back, shoulder]
	<T Corsivo>,		[italics, running]
	<T Citazione>,		Citation
  	<T Lettera> 		Letter
   	<T Risposta> 		Reply
    	<T Rubrica>, 		
	<T Tempo>, 		WeatherReport
	<T Film>, 		
	<T Breve>,		InShort
     	<T Flash>, 
	<T Agenzia>, 		Agency
	<T Riassunto>		[ECI--also sometimes appears on p--ECI]

  c) forms which are not so easily classified:
   	<T Notizia>, 		News
	<T Articolo>, 		[ECI--deleted - always redundant--ECI]
	<T Critica>, 		Criticism
	<T Partita>,
	<T Intervista>		Interview

  [ECI--Elziviro and Ricetta each also occur once--ECI]

  [ECI--div3 (occasionally) and div2 (rarely) appear w/o any type where
        a div was required by the document structure but no definitive
        information could be found in the original markup as to what type
	it should be--ECI]

<P>: sections of texts, marked to indicate a certain type of language:
     <P Discorso>, <P Prosa>, <P Citazione>,	[ECI--<q>...</q> as
							appropriate--ECI]
     <P Domanda>, <P Risposta>, the last two appearing only in interviews.
		[ECI--<p type=Question>, <p type=Reply> respectively--ECI]
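
The mapping described in item 7 can be sketched in Python as follows.
This is not the conversion program that was actually used; it only
illustrates the correspondences stated in the ECI comments.  All
function names are hypothetical, and the tag pattern assumes that every
COCOA tag has the shape "<", one capital letter, a space, a value, ">".

```python
import re

# Assumed tag shape: <X value>, with X a single capital letter.
COCOA_TAG = re.compile(r"<([A-Z]) ([^>]*)>")


def parse_cocoa_tags(line):
    """Return (category, value) pairs for every COCOA tag on a line."""
    return [(cat, value.strip()) for cat, value in COCOA_TAG.findall(line)]


def section_to_tei(name):
    """<S Politica> -> <div1 type=section n=Politica>"""
    return f"<div1 type=section n={name}>"


def article_to_tei(position):
    """<C AP05> attached to an article -> <div2 ... rend="pageNo:AP05">"""
    return f'<div2 type=article rend="pageNo:{position}">'


def heading_to_tei(kind):
    """<T Occhiello> -> <head type=Occhiello>"""
    return f"<head type={kind}>"


def author_to_tei(name, authorship):
    """Combine <N ...> and <A ...> into one <docAuthor> element."""
    return f"<docAuthor>\n{name}\n( {authorship} )\n</docAuthor>"
```

For instance, parse_cocoa_tags("<S Politica><C AP05>") returns the pairs
("S", "Politica") and ("C", "AP05"), which can then be passed to the
matching helper.  Tags that were deleted (<Z>, <E>, <T Articolo>) would
simply be skipped.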

9. Sometimes, i.e. not systematically, a code <G> is used to denote
words etc. in another language, e.g. <G Engl> for English words etc.
		[ECI--<foreign lang=eng>--ECI]

10. Text in double parentheses, i.e. (( )), has been defined as
'comment', e.g. ((sic)) [ECI--unclear when these are in the original and
				when they've been added by E. Burr--ECI]
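
A minimal Python sketch for locating such comments, assuming that they
never nest and never span more than one line (neither is stated above).
The function name is hypothetical.

```python
import re

# Text between (( and )), with surrounding whitespace trimmed.
COMMENT = re.compile(r"\(\(\s*(.*?)\s*\)\)")


def extract_comments(text):
    """Return the contents of every ((...)) comment in the text."""
    return COMMENT.findall(text)
```

Since the origin of a given comment is unclear, a list like this is best
used for manual inspection rather than automatic deletion.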

11. Special characters
Tag		Significance 

.		full stop
}		decimal point (alt 250 under WIN WORD) [ECI--.--ECI]
,		comma
\		comma in a number (alt 179 under WIN WORD) [ECI--,--ECI]
+		decomposition of compound 'preposition and article'
		e.g. 'della' becomes 'del+la'
*		decomposition of compound 'verb and clitic'
		e.g. 'presentarsi' becomes 'presentar*si'
		and 'ossia' becomes 'os*sia'
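
The two number-punctuation substitutions noted by ECI above can be
sketched in Python as a simple character translation.  The function name
is hypothetical, and the sketch assumes that '}' and '\' occur only
inside numbers.

```python
# '}' stood for the decimal point and '\' for the comma in a number.
NUMBER_MARKS = str.maketrans({"}": ".", "\\": ","})


def restore_number_punctuation(text):
    """Replace the contributor's number marks with '.' and ','."""
    return text.translate(NUMBER_MARKS)
```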

font markers:
@		Roman		[ECI--deleted--ECI]
^		Italic		[ECI--. . .rend=italic. . .--ECI]
~		Boldface	[ECI--. . .rend=bold. . .--ECI]
	[ECI--attribute attached either to <divn...> or <head...> or, if
		necessary, to an interpolated <hi>...</hi>.
		Where the FIRST letter (plus any preceding punctuation) of
		a unit was emboldened, rend=flbold was used, to avoid breaking
		words unnaturally.--ECI]
others:
=			hyphen	[ECI--soft - deleted and closed up--ECI]
-			dash or hyphen in compounds
	[ECI--But is also used 1) as punctuation, rarely doubled and
			       2) to mark embedded quotes.
         These usages usually, but NOT always, are signalled by spaces
	 preceding and/or following the hyphen--ECI]
'			apostrophe
"  " or << >>	quotes in correspondence with published text.

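The handling of the soft hyphen '=' ("deleted and closed up") might be
sketched in Python as below.  The notes do not say exactly how '='
appeared in the contributor's files; the sketch assumes it stood between
two word characters, optionally followed by a line break or spaces.  The
function name is hypothetical.

```python
import re

# '=' between word characters, plus any whitespace that follows it.
SOFT_HYPHEN = re.compile(r"(?<=\w)=\s*(?=\w)")


def close_soft_hyphens(text):
    """Delete soft hyphens and close up the two halves of the word."""
    return SOFT_HYPHEN.sub("", text)
```

The ordinary hyphen '-' is left alone here, since, as noted above, it
also serves as punctuation and as a marker for embedded quotes.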