Center for Spatial Information and Research

Shift-Share Analysis, GIS, and Rural Economic Competitiveness


The objective of this project was to create a user-friendly internet site offering community planners, economic development officials, and other interested parties tabular data and maps relating to the competitiveness of county economies in Idaho, Oregon, and Washington State. The website, titled the Pacific Northwest Advanced Regional Economic Analysis System or PN-AREAS, was completed and made available to the public in February 2011.

The central function of the website is to provide access to a geographic information system (GIS) that maps two kinds of data: 1) the results of a shift-share analysis (SSA) and 2) county-level data for a variety of factors that shape economic competitiveness. SSA was used to disaggregate county-level employment growth between 1998 and 2008 in the Pacific Northwest into three components: change attributable to national economic growth, change attributable to the local industrial mix, and a residual termed the regional shift. The last component captures all the remaining influences that explain why a local economy performs either better or worse than would be expected based on national- and industrial sector-level growth. The main data source for this stage of the analysis was the US Census Bureau series County Business Patterns (CBP).
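The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the classic shift-share formulation; the text does not state which variant the project used, and the function name, argument names, and employment figures below are hypothetical.

```python
def shift_share(local_start, local_end,
                nat_industry_start, nat_industry_end,
                nat_total_start, nat_total_end):
    """Classic shift-share decomposition for one industry in one county.

    Assumed formulation (not confirmed by the source):
      NationalShare = e0 * g_n
      IndustryMix   = e0 * (g_in - g_n)
      RegionalShift = e0 * (g_ir - g_in)
    where e0 is local employment in the start year.
    """
    g_n = nat_total_end / nat_total_start - 1         # national growth, all industries
    g_in = nat_industry_end / nat_industry_start - 1  # national growth, this industry
    g_ir = local_end / local_start - 1                # local growth, this industry
    return {
        "NationalShare": local_start * g_n,
        "IndustryMix": local_start * (g_in - g_n),
        "RegionalShift": local_start * (g_ir - g_in),
    }


# Hypothetical figures: a county industry grows from 1,000 to 1,200 jobs while
# the industry grows 10% nationally and total national employment grows 5%.
result = shift_share(1000, 1200, 50_000, 55_000, 1_000_000, 1_050_000)
total_change = sum(result.values())  # the three components add up to the actual change
```

In this formulation the three components always sum to the actual employment change, so a positive RegionalShift means the county industry outgrew its national counterpart.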

An important feature of this research was the use of goal programming optimization to estimate data suppressed in the CBP. To protect the confidentiality of individual firms, the CBP does not report employment data in cases where there are few firms in a particular industry in a particular county (e.g. the mining industry in Lewis County, Washington in 2008). Data suppression is very common for rural counties, which inherently have fewer firms. In place of suppressed data, the CBP reports a flag (i.e. A, B, C, ..., L) indicating the interval in which the relevant value lies (e.g. E = 250-499 employees for mining in Lewis County). Researchers who use CBP data commonly take the midpoint of an interval as an estimate for the suppressed value (e.g. about 375 in the case above). However, previous research has shown that this practice is not very accurate. An approach using linear programming and the data that the CBP does provide (e.g. industry totals for a particular state, county totals for all industries combined, etc.) yields far more accurate estimates for suppressed data, and that is the methodology used in this research. Specifically, the SNOPT solver from the General Algebraic Modeling System (GAMS) was used to estimate thousands of suppressed values scattered across the 119 Pacific Northwest counties at both the 2-digit North American Industry Classification System (NAICS) code level (e.g. 31 = manufacturing) and the 3-digit code level (e.g. 321 = wood product manufacturing).
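To illustrate why published totals help, the sketch below contrasts the midpoint rule with a much simpler stand-in for the optimization: it spreads a known residual (a published total minus all disclosed values) across suppressed cells in proportion to each interval's width. This is not the goal programming model solved with GAMS in the project, only a hedged illustration of using totals as constraints. Only the E = 250-499 interval comes from the text; the other flag ranges, the function names, and the residual of 400 are assumptions.

```python
# Employment intervals by suppression flag. Only "E" is given in the text;
# the other ranges are assumptions to be checked against CBP documentation.
FLAG_RANGES = {
    "A": (0, 19),
    "B": (20, 99),
    "C": (100, 249),
    "E": (250, 499),
    "F": (500, 999),
}


def midpoint(flag):
    """Midpoint rule: the common but inaccurate estimate for a suppressed cell."""
    lo, hi = FLAG_RANGES[flag]
    return (lo + hi) / 2


def fit_to_total(flags, residual_total):
    """Estimate suppressed cells so they sum to a known residual total.

    Every estimate stays inside its flag interval; the slack above the lower
    bounds is shared in proportion to interval width (a simplification).
    """
    bounds = [FLAG_RANGES[f] for f in flags]
    lo_sum = sum(lo for lo, _ in bounds)
    hi_sum = sum(hi for _, hi in bounds)
    if not lo_sum <= residual_total <= hi_sum:
        raise ValueError("residual total is inconsistent with the flag intervals")
    width = hi_sum - lo_sum
    share = (residual_total - lo_sum) / width if width else 0.0
    return [lo + share * (hi - lo) for lo, hi in bounds]


# Hypothetical county: three suppressed industries flagged E, B, and A, and a
# published county total that leaves 400 employees unaccounted for.
flags = ["E", "B", "A"]
midpoint_sum = sum(midpoint(f) for f in flags)  # ignores the published total
estimates = fit_to_total(flags, 400)            # consistent with the published total
```

The midpoint estimates here sum to well over 400, so they contradict the published total, whereas the constrained estimates match it exactly. A full model would impose such constraints simultaneously across counties, states, and industry levels.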

The resulting estimates (e.g. 292 employees for mining in Lewis County in 2008) were incorporated into the CBP data and used for the SSA. The PN-AREAS website maps county-level summary employment data for all industries combined and county-level employment at the 2-digit NAICS code level. At either level of analysis (summary or 2-digit NAICS), users can map data for 1998, 2008, and growth between the two years as well as the NationalShare, IndustryMix, and RegionalShift values produced by the SSA. Further, PN-AREAS allows users to download the full 2-digit and 3-digit datasets. Due to server limitations and the complexity of the data set, it was not possible to allow the mapping of the 3-digit data.

To better understand some of the factors that shape the RegionalShift, the PN-AREAS website also maps and makes available for download a variety of county-level data for variables that previous research has found to be significant factors in regional economic competitiveness. These include (among others): population potential (i.e. the population within 3 hours' trucking time), age of the housing stock, level of educational attainment, physical relief (the difference between the county’s highest and lowest elevations, used as a proxy for natural amenities), and a property crime index.

Most of the variables were freely available from US government sources in a form easily incorporated into the GIS, but several required analysis and processing. For instance, the physical relief data were produced using the Zonal Statistics tool in ArcMap. The most complex explanatory variable gives the number of people living within a certain driving time of a county center. To estimate this value for each county, the network analysis features of ArcMap were used in conjunction with: 1) a road network database prepared for another project; 2) detailed population data for both the US and Canada; and 3) various constraints such as a 90-minute penalty for crossing the US-Canada border to make the estimates more realistic.
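The driving-time calculation can be sketched as a shortest-path search in which border-crossing road segments carry an extra time cost. The example below is a minimal stand-in for the ArcMap network analysis, not a reproduction of it: the toy road network, place names, and population figures are hypothetical, while the 180-minute limit and 90-minute border penalty follow the figures given in the text.

```python
import heapq


def travel_times(graph, source, border_penalty=90.0):
    """Shortest travel time in minutes from source to each reachable node.

    graph maps node -> list of (neighbor, minutes, crosses_border) edges.
    Edges that cross the border cost an extra border_penalty minutes.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, minutes, crosses_border in graph.get(node, []):
            candidate = t + minutes + (border_penalty if crosses_border else 0.0)
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


def population_within(graph, population, source, limit=180.0, border_penalty=90.0):
    """Total population at nodes reachable from source within limit minutes."""
    times = travel_times(graph, source, border_penalty)
    return sum(population.get(n, 0) for n, t in times.items() if t <= limit)


# Hypothetical toy network (one-way edges, for brevity).
roads = {
    "county_center": [("town_a", 60, False), ("canadian_town", 100, True)],
    "town_a": [("town_b", 100, False)],
}
people = {"county_center": 10_000, "town_a": 5_000,
          "town_b": 20_000, "canadian_town": 30_000}

with_penalty = population_within(roads, people, "county_center")
without_penalty = population_within(roads, people, "county_center", border_penalty=0.0)
```

In this toy case the Canadian town is 100 minutes away by road but 190 minutes away once the penalty is applied, so it falls outside the 3-hour limit and its population is excluded.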

For each variable, data layers were created in ArcMap for use in the GIS. A graphical user interface (GUI) was then developed to allow internet access to the GIS. This final part of the project employed several software applications, including Quantum GIS, PostgreSQL, GeoServer, uDig, and OpenLayers. The first three were used to translate shapefiles into map and data files suitable for display on the web. uDig was used for styling GIS layers. Finally, OpenLayers, a JavaScript library, was used to display the layers in a web page.