Assessing performance of ZCTA-level and Census Tract-level social and environmental risk factors in a model predicting hospital events

Goetschius, Leigh G., Morgan Henderson, Fei Han, Dillon Mahmoudi, Chad Perman, Howard Haft, Ian Stockwell. 2023. “Assessing performance of ZCTA-level and Census Tract-level social and environmental risk factors in a model predicting hospital events.” Social Science & Medicine 326 (2023): 115943. doi:10.1016/j.socscimed.2023.115943

Summary

In the US, a lot of health data is analyzed by ZIP code because ZIP codes are easy to derive from addresses. ZIP codes are not geographic areas, though. They are mail carrier routes. The Census Bureau publishes ZIP Code Tabulation Areas (ZCTAs), but these aren’t the same as carrier routes, and both ZCTAs and carrier routes change quite a bit over time. So what happens if we geolocate addresses and then aggregate to the Census Tract level instead?

Results showed that increasing the granularity of area-based risk factors from ZCTA to Census Tract did not dramatically improve model fit or predictive performance. However, it did affect model interpretation by altering which social determinants of health (SDOH) features were retained during variable selection. Further, the inclusion of SDOH at either granularity level meaningfully reduced the risk that was attributed to demographic predictors (e.g., race, dual-eligibility for Medicaid).

The Modifiable Areal Unit Problem (MAUP) strikes again.
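
As a toy illustration of the MAUP, and not the paper's data or method, here is a minimal Python sketch. The residents, their positions, the "flag" attribute, and the equal-width zones are all hypothetical. The same eight people are aggregated into two large zones (standing in for ZCTAs) and four small zones (standing in for tracts), and the area-level rate assigned to each person changes with the zoning.

```python
from collections import defaultdict

# Hypothetical residents: (position along a street, 1 if flagged for some risk factor else 0).
residents = [(0.5, 1), (1.5, 1), (2.5, 0), (3.5, 0),
             (4.5, 1), (5.5, 0), (6.5, 0), (7.5, 0)]


def area_rates(people, zone_width):
    """Return {zone_id: share of residents flagged} for equal-width zones."""
    counts = defaultdict(lambda: [0, 0])  # zone_id -> [flagged, total]
    for x, flag in people:
        zone = int(x // zone_width)
        counts[zone][0] += flag
        counts[zone][1] += 1
    return {zone: flagged / total for zone, (flagged, total) in counts.items()}


coarse = area_rates(residents, zone_width=4.0)  # two large zones
fine = area_rates(residents, zone_width=2.0)    # four small zones

# The resident at x = 0.5 gets a different area-level value under each zoning.
x = 0.5
print(coarse[int(x // 4.0)], fine[int(x // 2.0)])
```

Nothing about the individuals changes between the two runs. Only the boundaries do, which is why the choice of areal unit can shift which area-level features look important.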