Prepared CMF Data Structure
An observation in the prepared data is one establishment. Each observation is uniquely identified by the variables file_name and firm_number; in the 1880 special schedules, the variable schedule is also needed. The following table shows the number of observations and variables in each cleaned dataset:
Dataset File Name | Observations | Variables |
---|---|---|
1850 | 124,873 | 329 |
1860 | 121,168 | 443 |
1870 | 203,353 | 753 |
1880 General Schedule | 170,811 | 102 |
1880 SS1 | 1,981 | 204 |
1880 SS2 | 677 | 271 |
1880 SS3 | 2,161 | 85 |
1880 SS4 | 3,083 | 121 |
1880 SS5 | 24,240 | 120 |
1880 SS6 | 4,856 | 82 |
1880 SS7 | 23,251 | 136 |
1880 SS8 | 3,508 | 96 |
1880 SS9 | 4,632 | 104 |
1880 SS10 | 128 | 70 |
1880 SS11 | 2 | 82 |
1880 SS12 | 1,381 | 97 |
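
The identifying key described above can be checked after loading a dataset. The following is a minimal sketch, assuming each dataset has been read into a list of dictionaries (for example with csv.DictReader). The helper duplicate_keys and all row values are hypothetical and chosen only for illustration; only the variable names file_name, firm_number, and schedule come from the prepared data.

```python
from collections import Counter


def duplicate_keys(rows, key_fields=("file_name", "firm_number")):
    """Return the sorted key tuples that occur more than once in `rows`."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Toy rows standing in for a prepared dataset; the values are invented.
rows = [
    {"file_name": "img_0001", "firm_number": 1},
    {"file_name": "img_0001", "firm_number": 2},
    {"file_name": "img_0002", "firm_number": 1},
]
assert duplicate_keys(rows) == []  # every establishment appears once

# For the 1880 special schedules, add the schedule variable to the key.
# (Storing the schedule as an integer is an assumption of this sketch.)
ss_rows = [
    {"file_name": "img_0003", "firm_number": 1, "schedule": 5},
    {"file_name": "img_0003", "firm_number": 1, "schedule": 7},
]
ss_key = ("file_name", "firm_number", "schedule")
assert duplicate_keys(ss_rows, ss_key) == []
```

An empty result means the key identifies observations uniquely. Any tuples returned are the keys shared by more than one row.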
Variable Availability
In the following variable availability tables, a check (✓) indicates that the variable is available, a star (★) indicates that it can be constructed from available data, and a dash (-) marks a variable that is unknowable for that year.
Manuscript Variables
The following table lists the variables original to the manuscripts and their availability by year. Because many manuscript variables are unique to 1880, those are omitted from this table.
Variable | 1850 | 1860 | 1870 | 1880 |
---|---|---|---|---|
Firm Name | ✓ | ✓ | ✓ | ✓ |
Industry | ✓ | ✓ | ✓ | ✓ |
Capital | ✓ | ✓ | ✓ | ✓ |
Materials Quantity | ✓ | ✓ | ✓ | - |
Materials Kind | ✓ | ✓ | ✓ | - |
Materials Unit of Measure | ✓ | ✓ | ✓ | - |
Materials Value | ✓ | ✓ | ✓ | ★ |
Number of Male Hands | ✓ | ✓ | ✓ | ★ |
Number of Female Hands | ✓ | ✓ | ✓ | ★ |
Number of Children Hands | - | - | ✓ | ★ |
Mean Male Wage | ✓ | ✓ | - | - |
Mean Female Wage | ✓ | ✓ | - | - |
Total Wages | ★ | ★ | ✓ | ✓ |
Months Active | - | - | ✓ | ★ |
Production Quantity | ✓ | ✓ | ✓ | - |
Production Kinds | ✓ | ✓ | ✓ | - |
Production Values | ✓ | ✓ | ✓ | ★ |
Production Unit of Measure | ✓ | ✓ | ✓ | - |
Power Kind | ✓ | ✓ | ✓ | ★ |
Machine Description | ★ | ★ | ✓ | ★ |
Number of Machines | - | - | ✓ | ★ |
Horsepower Measure | - | - | ✓ | ★ |
State | ✓ | ✓ | ✓ | ✓ |
County | ✓ | ✓ | ✓ | ✓ |
Township | ✓ | ✓ | ✓ | ✓ |
Closest Post Office | ✓ | ✓ | ✓ | ✓ |
Constructed Variables
The following table lists the constructed variables available in the prepared data. Note that no constructed variable is unique to the 1880 general schedule.
Variable | 1850 | 1860 | 1870 | 1880 |
---|---|---|---|---|
Broadest Industry Category | ✓ | ✓ | ✓ | ✓ |
Cleaned Industry | ✓ | ✓ | ✓ | ✓ |
Cleaned Material Kind | ✓ | ✓ | ✓ | - |
Cleaned Number of Children | - | - | ✓ | ★ |
Cleaned Product Kind | ✓ | ✓ | ✓ | ★ |
Cleaned Total Wages | - | - | ✓ | ★ |
Detailed Industry Category | ✓ | ✓ | ✓ | ✓ |
Establishment Number | ✓ | ✓ | ✓ | ✓ |
FIPS Code | ✓ | ✓ | ✓ | ✓ |
Granular Industry Category | ✓ | ✓ | ✓ | ✓ |
Image File Name | ✓ | ✓ | ✓ | ✓ |
Is Maker | ✓ | ✓ | ✓ | ✓ |
Is Manufacturer | ✓ | ✓ | ✓ | ★ |
Is Shop | ✓ | ✓ | ✓ | ★ |
Is a Factory | ✓ | ✓ | ✓ | ✓ |
Leontief Industry Category | ✓ | ✓ | ✓ | ✓ |
Machine Category | ✓ | ✓ | ✓ | ★ |
Machine Kind | ✓ | ✓ | ✓ | ★ |
Machine Unit | ✓ | ✓ | ✓ | ★ |
Material Kind Attribute | ✓ | ✓ | ✓ | - |
Material Units | ✓ | ✓ | ✓ | - |
Material Unsure | ✓ | ✓ | ✓ | - |
Material is Miscellaneous | ✓ | ✓ | ✓ | - |
Material is Note | ✓ | ✓ | ✓ | - |
Material is Service | ✓ | ✓ | ✓ | - |
Product Kind Attribute | ✓ | ✓ | ✓ | - |
Product is Miscellaneous | ✓ | ✓ | ✓ | ★ |
Product is Note | ✓ | ✓ | ✓ | - |
Product is Service | ✓ | ✓ | ✓ | ★ |
Product is Units | ✓ | ✓ | ✓ | - |
Product is Unsure | ✓ | ✓ | ✓ | ★ |
Uses Hand Power | ✓ | ✓ | ✓ | ★ |
Uses Horse Power | ✓ | ✓ | ✓ | ★ |
Uses Steam Power | ✓ | ✓ | ✓ | ★ |
Uses Water Power | ✓ | ✓ | ✓ | ★ |
Uses Wind Power | ✓ | ✓ | ✓ | ★ |
Unique 1880 General Schedule Variables
The 1880 general schedule contains a number of unique variables, some of which have analogous variables or can be constructed in other years.
Variable | 1850 | 1860 | 1870 | 1880 |
---|---|---|---|---|
Total Materials Value | ★ | ★ | ★ | ✓ |
Adult Male Workers | ★ | ★ | ★ | ✓ |
Adult Female Workers | ★ | ★ | ★ | ✓ |
Children Workers | - | - | ★ | ✓ |
Maximum Workers | ★ | ★ | ★ | ✓ |
Skilled Daily Wage | - | - | - | ✓ |
Unskilled Daily Wage | - | - | - | ✓ |
Hours May-November | - | - | - | ✓ |
Hours November-May | - | - | - | ✓ |
Months 1/2 Time | - | - | - | ✓ |
Months 2/3 Time | - | - | - | ✓ |
Months 3/4 Time | - | - | - | ✓ |
Months Full-Time | - | - | ★ | ✓ |
Months Idle | - | - | ★ | ✓ |
Total Production Values | ★ | ★ | ★ | ✓ |
Number of Steam Boilers | - | - | - | ✓ |
Number of Steam Engines | - | - | - | ✓ |
Horsepower from Steam | - | - | ★ | ✓ |
Height of River Fall | - | - | - | ✓ |
Which River Used | - | - | - | ✓ |
Number of Water Wheels | - | - | - | ✓ |
Breadth of Water Wheels | - | - | - | ✓ |
Horsepower of Water Wheels | - | - | - | ✓ |
Kind of Water Wheel | - | - | - | ✓ |
RPM of Water Wheels | - | - | - | ✓ |
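
Stars in the 1850-1870 columns above mean that an 1880-style variable can be built from the itemized data of the earlier year. The following is a minimal sketch of two such constructions. It assumes that Total Materials Value is the sum of the itemized materials values for an establishment, and that Months Idle in 1870 equals 12 minus Months Active. Both rules are assumptions made for illustration, not documented definitions, and the function names are hypothetical.

```python
def total_value(item_values):
    """Sum itemized values, skipping blanks (None).

    Returns None when no value was recorded at all, so that a missing
    total is not confused with a true zero.
    """
    recorded = [v for v in item_values if v is not None]
    return sum(recorded) if recorded else None


def months_idle(months_active):
    """Months idle in a 12-month census year, ASSUMING idle = 12 - active."""
    if months_active is None:
        return None
    if not 0 <= months_active <= 12:
        raise ValueError(f"months_active out of range: {months_active}")
    return 12 - months_active


# Invented values for one establishment with three material lines,
# one of which was left blank on the manuscript.
assert total_value([100, None, 250]) == 350
assert total_value([None, None]) is None
assert months_idle(9) == 3
```

Returning None rather than 0 for an all-blank establishment keeps missing totals distinguishable from establishments that reported no materials.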
1880 Special Schedule Variables
There are 324 unique variables in the 1880 special schedules. The following file shows the availability of each variable in each schedule.