Summary
We now reproduce Census 2024 SPM work expenses and capped work+childcare expenses closely in aggregate after the recent childcare-cap fixes, but a narrow unit-level tail remains.
Current direct raw-CPS replication against the official 2025 ASEC public-use file for 2024 shows:
- SPM_WKXPNS weighted total ratio: 1.00185
- SPM_CAPWKCCXPNS weighted total ratio: 0.99958
- SPM_CAPWKCCXPNS positive-unit MAE: about $17
- SPM_CAPWKCCXPNS positive-unit share within $1: about 97.22%
So the broad formula looks right. The remaining issue is localized tail accuracy, not aggregate bias.
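For reference, the fit metrics above can be sketched as follows. This is a minimal illustration, not the replication script itself: the function name `fit_metrics` and its arguments are hypothetical, and it assumes the ratio is model total over Census total, that "positive-unit" means units where the Census value is positive, and that MAE and the within-$1 share are unweighted across those units.

```python
def fit_metrics(census, model, weights, tol=1.0):
    """Compare modeled values to Census values for one SPM variable.

    census, model, weights: equal-length sequences, one entry per SPM unit.
    All definitions here are assumptions made for illustration.
    """
    # Weighted totals across all units.
    census_total = sum(w * c for c, w in zip(census, weights))
    model_total = sum(w * m for m, w in zip(model, weights))

    # Unit-level absolute errors, restricted to units where Census is positive.
    errors = [abs(m - c) for c, m in zip(census, model) if c > 0]

    return {
        "weighted_total_ratio": model_total / census_total,
        "positive_unit_mae": sum(errors) / len(errors),
        "share_within_tol": sum(e <= tol for e in errors) / len(errors),
    }


# Toy data only, not CPS records.
metrics = fit_metrics(
    census=[100, 0, 200, 50],
    model=[100, 0, 190, 50.5],
    weights=[1, 2, 1, 2],
)
```

A total ratio near 1 alongside a high within-$1 share is the pattern described here: the aggregate matches, and the MAE is driven by a few large misses.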
What remains
The biggest remaining gaps are concentrated in a small set of units where we still choose the wrong lower-earner/reference-person pairing for the childcare cap.
Current pattern:
- remaining top misses are mostly not cohabitors anymore
- only about 12% of the top-100 cap misses are cohabiting units
- cohabitors account for only about 9.8% of top-100 absolute error
- the remaining tail is now mostly in more complex non-cohabiting married/multi-adult SPM units
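The composition figures above can be computed along these lines. This is a sketch under assumptions: the function `top_miss_composition` and the record keys `abs_error` and `is_cohabiting` are hypothetical names, and the ranking is assumed to be by unweighted absolute error per unit.

```python
def top_miss_composition(units, n=100):
    """Summarize how much of the top-n error tail comes from cohabiting units.

    units: list of dicts with hypothetical keys 'abs_error' (float)
    and 'is_cohabiting' (bool), one dict per SPM unit.
    """
    # Rank units by absolute error, largest first, and keep the top n.
    top = sorted(units, key=lambda u: u["abs_error"], reverse=True)[:n]
    total_error = sum(u["abs_error"] for u in top)
    cohab = [u for u in top if u["is_cohabiting"]]

    return {
        # Share of the top-n units that are cohabiting units.
        "cohab_unit_share": len(cohab) / len(top),
        # Share of top-n absolute error attributable to those units.
        "cohab_error_share": sum(u["abs_error"] for u in cohab) / total_error,
    }


# Toy records only, not CPS data.
toy_units = [
    {"abs_error": 50.0, "is_cohabiting": False},
    {"abs_error": 30.0, "is_cohabiting": True},
    {"abs_error": 20.0, "is_cohabiting": False},
    {"abs_error": 1.0, "is_cohabiting": True},
]
summary = top_miss_composition(toy_units, n=3)
```

Reporting both the unit share and the error share matters because a group can be a small fraction of the top misses yet still hold most of the error, or the reverse.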
Representative misses:
- underpredictions where Census allows a much larger childcare add-on than our selected lower-earner cap
- overpredictions where we still allow too much childcare relative to Census in a narrow set of units
Likely cause
The remaining tail appears to come from incomplete reconstruction of Census family-role logic in complex SPM units, especially around:
- exact reference person / spouse mapping in multi-adult units
- edge cases where tax-unit roles do not fully recover Census's SPM reference-person structure
- possible additional relationship fields or tie-break rules not yet carried through the CPS pipeline
Proposed follow-up
- Build a reproducible raw-CPS comparison script into the repo so this does not live only in local analysis.
- Audit the worst remaining SPM_CAPWKCCXPNS misses record-by-record.
- Identify which extra CPS relationship/reference fields are needed to match Census role assignment in the remaining tail.
- Tighten the model only if the extra role reconstruction clearly improves the tail without hurting aggregate fit.
- Add regression tests for the newly understood edge cases.
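The "tighten only if" condition in the list above could be made explicit as an acceptance check. The sketch below is one possible framing, not an agreed rule: the function `accept_change`, the dict keys, and the `max_ratio_drift` tolerance are all hypothetical and would need to be chosen by whoever owns the comparison script.

```python
def accept_change(baseline, candidate, max_ratio_drift=0.001):
    """Decide whether a candidate role-reconstruction change should be kept.

    baseline, candidate: dicts with hypothetical keys
      'weighted_total_ratio' (model / Census weighted total) and
      'top_abs_error' (summed absolute error over the worst misses).
    """
    # Distance of each aggregate ratio from a perfect match of 1.0.
    base_gap = abs(baseline["weighted_total_ratio"] - 1.0)
    cand_gap = abs(candidate["weighted_total_ratio"] - 1.0)

    # The tail must strictly improve.
    tail_improves = candidate["top_abs_error"] < baseline["top_abs_error"]
    # The aggregate fit may not degrade by more than the allowed drift.
    aggregate_holds = cand_gap <= base_gap + max_ratio_drift

    return tail_improves and aggregate_holds
```

A check like this could sit alongside the regression tests so that a change which fixes individual records but shifts the weighted total is flagged automatically.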
Related work