Data for this project was collected from a wide range of sources. As with any project that combines multiple data sources, some tradeoffs had to be made to produce usable data. The largest issue is that much of the data provided for this project was at the zip code level. Because zip codes are defined by the U.S. Postal Service (and not an entity like the U.S. Census Bureau), they are constantly being updated. What’s more, zip codes are simply collections of houses, and can’t necessarily be represented as polygons in the same way that the boundaries of census tracts, counties, and other geographical entities can.
For this project, we used what are known as zip code tabulation areas (ZCTAs), which are provided by the Census Bureau. While ZCTAs mostly match zip codes, they don’t always. To use this data, we developed a list of zip codes in Washington County.
However, data from the U.S. Census Bureau on these ZCTAs did not have all of the above zip codes. The table below shows those that were missing. This should be taken into consideration when interpreting results.
To use data provided at the ZCTA or census tract level, we had to identify overlaps between:
We did this using the sf package for the statistical analysis software R.
First, a table was developed that shows the overlaps between ZCTAs and school catchment areas.
A second table was developed that shows the overlaps between census tracts and school catchment areas.
When data was provided at the ZCTA or census tract level, it was merged into school catchment areas using the following steps:
To illustrate the shapefiles used, below are maps at the school catchment area, ZCTA, and census tract levels.
Data on ERDC and Head Start was provided at the zip code level. It is shown at the school catchment area level throughout, using the merging procedure outlined above. Because the number of ERDC and Head Start providers within each school catchment area is calculated as an average across all the ZCTAs that it touches, the result is often not a whole number.
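The averaging described above can be sketched in a few lines. This is a minimal Python illustration, not the R code used for the project. The catchment names, ZCTA labels, and provider counts are hypothetical placeholders. It assumes a simple unweighted mean over every ZCTA that touches a catchment area, and it assumes that ZCTAs with no data are skipped.

```python
from statistics import mean

# Hypothetical overlap table: the ZCTAs that touch each school catchment area.
overlaps = {
    "Catchment A": ["Z1", "Z2"],
    "Catchment B": ["Z2", "Z3", "Z4"],
}

# Hypothetical provider counts reported at the ZCTA level.
providers_by_zcta = {"Z1": 3, "Z2": 4, "Z3": 0, "Z4": 2}


def average_by_catchment(overlaps, values):
    """Average a ZCTA-level value over all ZCTAs touching each catchment area."""
    result = {}
    for catchment, zctas in overlaps.items():
        # Assumption: ZCTAs missing from the data are left out of the average.
        known = [values[z] for z in zctas if z in values]
        result[catchment] = mean(known) if known else None
    return result


catchment_providers = average_by_catchment(overlaps, providers_by_zcta)
print(catchment_providers)
```

In this made-up example, Catchment A touches ZCTAs with 3 and 4 providers, so its value is 3.5, which shows why the merged figures are often not whole numbers.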
While the data on child care deserts provided by the Center for American Progress is helpful, it comes with caveats. Specifically, they wrote to explain how the data was collected:
The state of Oregon wouldn’t give out the street addresses of family child care homes, only the ZIP codes. Thus, when we geocoded the FCCHs, they all get placed at the center point of the ZIP Code, and then got assigned to the census tract that covers that point. So there’s a fair amount of error in the Oregon capacity numbers, which we’re attempting to rectify in the next version of this project.
Given the lack of precision on the locations of family child care providers, results should be interpreted with caution.
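To make the source of this error concrete, the sketch below compares tract counts based on true provider locations with counts obtained when every provider is placed at the ZIP code's center point. It is a hypothetical Python illustration, not the Center for American Progress's code. The tract shapes, the center point, and the provider coordinates are invented, and the point-in-polygon test is a generic ray-casting check.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point is inside polygon (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_by_tract(points, tracts):
    """Count how many points fall inside each tract polygon."""
    counts = {name: 0 for name in tracts}
    for p in points:
        for name, polygon in tracts.items():
            if point_in_polygon(p, polygon):
                counts[name] += 1
                break
    return counts


# Hypothetical ZIP code spanning two census tracts (coordinates are invented).
tracts = {
    "Tract West": [(0, 0), (3, 0), (3, 2), (0, 2)],
    "Tract East": [(3, 0), (4, 0), (4, 2), (3, 2)],
}
zip_center = (2, 1)  # center point of the whole ZIP code area

# Hypothetical true provider locations: one in the west tract, two in the east.
true_locations = [(1, 1), (3.5, 1), (3.5, 0.5)]

true_counts = count_by_tract(true_locations, tracts)
center_counts = count_by_tract([zip_center] * len(true_locations), tracts)
print("true:", true_counts)
print("placed at ZIP center:", center_counts)
```

With these invented coordinates, all three providers are credited to the west tract even though two are actually in the east tract.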
Please note that much of the data here has high margins of error, given that it covers small populations. All data should be interpreted with caution.
A zip file with all of the raw data used in the maps and tables is available here.