Michigan Tradesman Dataset
Access
This dataset is publicly available.
Permanent Link
https://doi.org/10.25335/4gg5-y985
Description
The full text, PDFs, and metadata available for download are derived from The Michigan Tradesman collection available via the MSU Libraries' digital repository.
Data Summary
One zip file contains the full plain text; another (much larger) file contains all PDFs; and a third contains metadata in two formats: MODS and Dublin Core. The MODS data is more complete and acts as the primary record of each newspaper issue, while the Dublin Core data is less complete but also less hierarchical and perhaps easier to read or parse. The Dublin Core metadata has also been converted to CSV.
The text, which was produced by OCR using Tesseract, is uncorrected and of widely varying quality. Metadata has been applied at the issue (not article) level.
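As a starting point for working with the CSV version of the Dublin Core metadata, the following is a minimal sketch that reads every CSV member of a zip archive into a list of rows. The function name read_csv_rows, the archive name in the usage comment, and the UTF-8 encoding are assumptions for illustration; the actual file names, column names, and encoding in the download may differ.

```python
import csv
import io
import zipfile


def read_csv_rows(zip_source, member_suffix=".csv"):
    """Return the rows of every CSV member in a zip archive as dicts.

    zip_source may be a path or a binary file-like object.
    Hypothetical helper: the layout of the real archive is not
    documented here, so every member ending in member_suffix is read.
    """
    rows = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(member_suffix):
                continue  # skip MODS/Dublin Core XML and anything else
            with archive.open(name) as raw:
                # Assumption: the CSV is UTF-8; "utf-8-sig" also
                # tolerates a leading byte-order mark.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                rows.extend(csv.DictReader(text))
    return rows


# Usage (the archive name is a placeholder, not the real download name):
# rows = read_csv_rows("metadata.zip")
# print(len(rows), "issue-level records")
```

Because metadata is applied at the issue level, each row returned this way would describe one issue rather than one article.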
Contact Us
If you have any questions or suggestions concerning this data, please send them to the Digital Scholarship Lab.