Standardize 1880
This step of the pipeline makes the 1880 special schedule data consistent with earlier years by standardizing how products and materials are recorded. Unlike 1850–1870, the 1880 census includes several “special schedules” that each cover a specific industry (such as cheese, tanning, or sawmills), and each schedule has its own variable names and formats. The scripts in this step clean and reformat those variables: they resolve inconsistent names, standardize units (e.g., pounds, barrels, gallons), and create clean product and material fields that align with the structure used for the earlier census years.
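As a rough illustration of the unit standardization described above, the sketch below maps spelling variants of a unit onto one canonical label. It is not taken from the pipeline's scripts: the alias table, the function name `standardize_unit`, and the specific variants are assumptions chosen for the example, and the real schedules may use different abbreviations.

```python
import re

# Hypothetical spelling variants mapped to one canonical unit label.
# The actual variants found in the 1880 schedules are not listed in this
# document, so this table is illustrative only.
UNIT_ALIASES = {
    "lb": "pounds",
    "lbs": "pounds",
    "pound": "pounds",
    "pounds": "pounds",
    "bbl": "barrels",
    "bbls": "barrels",
    "barrel": "barrels",
    "barrels": "barrels",
    "gal": "gallons",
    "gals": "gallons",
    "gallon": "gallons",
    "gallons": "gallons",
}


def standardize_unit(raw):
    """Return a canonical unit label, or None if the label is unrecognized."""
    if raw is None:
        return None
    # Lowercase, then drop punctuation and whitespace so "Lbs." matches "lbs".
    key = re.sub(r"[^a-z]", "", raw.lower())
    return UNIT_ALIASES.get(key)
```

Returning `None` for unknown labels, rather than passing the raw text through, makes unrecognized units easy to find and review before they reach the merged dataset.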
For each of the 12 special schedules, the scripts extract the relevant product or material information, reshape it into a consistent format, and assign standardized labels and units. They also group similar items (e.g., different kinds of leather or flour) into broader categories for comparability. Once cleaned, the product and material data are merged back into the main 1880 datasets. The result is a uniform dataset in which product and material information from the 1880 special schedules can be compared directly with the data from 1850, 1860, and 1870.
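The extract, reshape, group, and merge steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the column names (`wheat_flour_bbl`, `rye_flour_bbl`), the identifier field `firm_id`, and the functions `reshape_products` and `merge_products` are all hypothetical.

```python
# Hypothetical mapping for one special schedule: each schedule-specific
# column is assigned a (product label, unit, broader category) tuple.
MILL_COLUMNS = {
    "wheat_flour_bbl": ("wheat flour", "barrels", "flour"),
    "rye_flour_bbl": ("rye flour", "barrels", "flour"),
}


def reshape_products(row, column_map, id_field="firm_id"):
    """Turn one wide record into a list of long-format product records."""
    records = []
    for column, (label, unit, category) in column_map.items():
        quantity = row.get(column)
        if quantity in (None, "", 0):
            continue  # skip products the establishment did not report
        records.append({
            id_field: row[id_field],
            "product": label,
            "category": category,
            "quantity": quantity,
            "unit": unit,
        })
    return records


def merge_products(main_rows, product_records, id_field="firm_id"):
    """Attach the cleaned product records to each main-dataset row."""
    by_id = {}
    for rec in product_records:
        by_id.setdefault(rec[id_field], []).append(rec)
    # Rows with no reported products get an empty list rather than being dropped.
    return [dict(row, products=by_id.get(row[id_field], [])) for row in main_rows]


# Usage with made-up values:
wide_row = {"firm_id": 7, "wheat_flour_bbl": 120, "rye_flour_bbl": 0}
long_rows = reshape_products(wide_row, MILL_COLUMNS)
merged = merge_products([{"firm_id": 7}, {"firm_id": 8}], long_rows)
```

Keeping one mapping table per schedule means the reshape and merge logic can be shared, while only the column-to-label mapping changes between schedules.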