Name                                Size
michigan_tradesman.csv              365.2 KB
michigan_tradesman_metadata.zip     3 MB
michigan_tradesman_pdf.zip          33.1 GB
michigan_tradesman_text.zip         143.9 MB

Michigan Tradesman Dataset

Access

This dataset is publicly available.

Permanent Link

https://doi.org/10.25335/4gg5-y985

Description

The full text, PDFs, and metadata available as downloads above are derived from The Michigan Tradesman collection, available via the MSU Libraries' digital repository.


Data Summary

The downloads include three zip files: one contains the full plain text; another, much larger, contains all the PDFs; and a third contains metadata in two formats, MODS and Dublin Core. The MODS data is more complete and acts as the primary record of each newspaper issue, while the Dublin Core data is less complete but also less hierarchical and perhaps easier to read or parse. The Dublin Core metadata has also been converted to CSV.
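
As a starting point for working with the CSV version of the metadata, the following Python sketch reports how many rows the file has, which columns it contains, and how many rows have a non-empty value in each column. It assumes the file is UTF-8 encoded with a header row; the column names are not documented here, so the code reads them from the file rather than hard-coding any. The function name summarize_csv is hypothetical.

```python
import csv
from collections import Counter


def summarize_csv(path):
    """Return (row_count, column_names, filled_counts) for a CSV file.

    Assumes UTF-8 encoding and a header row; neither is confirmed
    by the dataset description.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        filled = Counter()
        rows = 0
        for row in reader:
            rows += 1
            for name, value in row.items():
                # Skip overflow fields (name is None) and empty cells.
                if name is not None and value and value.strip():
                    filled[name] += 1
        return rows, list(reader.fieldnames or []), dict(filled)
```

Calling summarize_csv("michigan_tradesman.csv") on the downloaded file would show which Dublin Core fields are populated often enough to be useful for filtering or grouping issues.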

The text, which was produced by OCR using Tesseract, is uncorrected and varies widely in quality. Metadata has been applied at the issue (not article) level.
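
Because the OCR quality varies, it can help to screen files before analysis. The sketch below reads text files directly from a zip archive and computes a crude quality proxy: the share of non-whitespace characters that are letters. It assumes the archive holds UTF-8 files ending in .txt, which is not confirmed by the description above, and the function names and the metric itself are illustrative choices, not part of the dataset.

```python
import zipfile


def iter_text_members(zip_path, suffix=".txt"):
    """Yield (member_name, text) for each matching file in the archive.

    The ".txt" suffix and UTF-8 encoding are assumptions; undecodable
    bytes are replaced rather than raising an error.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(suffix):
                continue
            raw = archive.read(info)
            yield info.filename, raw.decode("utf-8", errors="replace")


def alpha_ratio(text):
    """Share of non-whitespace characters that are letters (0.0 if empty)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c.isalpha() for c in chars) / len(chars)
```

Iterating over iter_text_members("michigan_tradesman_text.zip") and sorting by alpha_ratio would surface the files most likely to contain heavy OCR noise. Any cutoff for excluding files would need to be chosen by inspecting samples.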


Contact Us

If you have any questions or suggestions concerning this data, please send them to the Digital Scholarship Lab.

