How we did it: Our exclusive Affordable Care Act database project

Digital First Media was the first news organization to build a national database for analyzing premium costs on the health insurance exchanges set up as a result of the Affordable Care Act.

This unique analysis shed light on how tax credits are making premium costs more affordable, and on the large disparities that remain in how much consumers will actually pay simply because of where they live. The story, by senior data reporter MaryJo Webster and St. Paul Pioneer Press health care reporter Chris Snowbeck, and the interactive map, by DFM news apps developer Vaughn Hagerty, were published Feb. 28.

Here’s some insight into how we did it.

Q: Why did you think this was worth pursuing?

Chris Snowbeck, St. Paul Pioneer Press

Chris Snowbeck: In covering the launch of the Minnesota exchange, we found there were significant premium disparities between the Twin Cities, Rochester, Minn., and nearby western Wisconsin. We compared the “benchmark” premium (a 50-year-old non-smoker buying the second-lowest-cost silver tier policy) to the same figures for the states in the federal exchange. That comparison showed the rating area encompassing the Twin Cities had the lowest premiums, while Rochester and an area of western Wisconsin just a short drive from St. Paul were both among the highest.

This raised two big questions that could only be answered with a complete national dataset: Would this hold true if we looked at the entire United States? And who would get the “better deal” — the consumer in the Twin Cities with low premiums and little chance of getting a federal tax credit, or the consumer in the higher-cost places that will benefit from the subsidies?

Q: How did you collect the data?

MaryJo Webster, Digital First Media

MaryJo Webster: The data for the states in the federal exchange was nicely packaged in an Excel file that the Department of Health and Human Services had posted online. Collecting data for the 15 states that run their own exchanges was a bit harder. Once we figured out the correct terminology when making our request, this wasn’t as hard as we first suspected. Numerous public relations officials told us others had been requesting the same data. In some states, we had to go to the agency that regulates insurance; in other places, we got the information from the exchange.

We didn’t have to file official public records requests, though a couple of states did ask for a written request. A surprising number of states had this information on their websites (although it was not always easy to find).

Q: What did you ask for?

Webster: We decided to request, at minimum, the same thing that had been provided in the data for the federal exchange. That data showed the lowest-cost premium for each metal tier (bronze, silver, gold, platinum), in each rating area for individual consumers (not families and not those getting insurance through their employer). In most cases, we obtained a base rate and used age multiplier tables — also provided by the states — to calculate the premiums at each of the 45 age points used by the exchanges. We also made sure to get the second lowest-cost silver premium for each age and rating area. This is used as part of the calculation to determine how much tax credit a consumer would get. Several states gave us premiums for every product available on their exchanges, and we pulled out the lowest-cost for each metal tier and the second lowest-cost silver premium. The hardest part of all this was that most of the data from the state-run exchanges came in PDFs that had to be dumped out to Excel.
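As a rough illustration of the arithmetic described above (not the team's actual spreadsheet or code), here is a minimal Python sketch. It multiplies each plan's base rate by an age multiplier, then picks the lowest-cost premium in each metal tier and the second lowest-cost silver premium for one rating area. The function name, plan fields, rates and multipliers are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical age multipliers; real tables came from each state.
AGE_MULTIPLIERS = {21: 1.0, 64: 3.0}


def summarize_rating_area(plans, multipliers, age):
    """Return (lowest premium per tier, second lowest-cost silver premium).

    `plans` is a list of dicts with 'tier' and 'base_rate' keys
    (an assumed layout, chosen only for this sketch).
    """
    prices_by_tier = defaultdict(list)
    for plan in plans:
        premium = round(plan["base_rate"] * multipliers[age], 2)
        prices_by_tier[plan["tier"]].append(premium)

    lowest = {tier: min(prices) for tier, prices in prices_by_tier.items()}

    # The second lowest-cost silver plan feeds the tax credit calculation.
    silver = sorted(prices_by_tier.get("silver", []))
    second_lowest_silver = silver[1] if len(silver) > 1 else None
    return lowest, second_lowest_silver


# Invented plans for a single rating area.
EXAMPLE_PLANS = [
    {"tier": "silver", "base_rate": 200.0},
    {"tier": "silver", "base_rate": 220.0},
    {"tier": "silver", "base_rate": 250.0},
    {"tier": "bronze", "base_rate": 150.0},
]
```

Calling `summarize_rating_area(EXAMPLE_PLANS, AGE_MULTIPLIERS, 64)` would give the lowest bronze and silver premiums for a 64-year-old, plus the second lowest-cost silver premium for that rating area.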

Q: How did you analyze this data?

Webster: Our goal was to determine what people at various ages, income levels and geographies would pay for health insurance premiums after federal tax credits were applied for those eligible. For individuals, the general rule is that you would be eligible for a tax credit if your income is at or below 400 percent of the federal poverty threshold. In the contiguous United States, that amounts to about $45,960 per year (it’s higher in Alaska and Hawaii). With such a massive amount of data, we knew we had to focus at least a little. So we chose to look at how the premiums and tax credits would play out for consumers buying the lowest-cost silver plan available to them. We chose the silver tier because enrollment data in the federal exchange show 60 percent of consumers are choosing silver tier plans.

We ran calculations to estimate the tax credit and ultimate consumer cost (plus what percentage of their income that would be) for people at key income points between 200 and 400 percent of the federal poverty threshold. We also calculated cost as a percentage of income at 401 percent of poverty, and other income levels that are above the threshold for tax credit eligibility.
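A sketch of how such a calculation might look in Python follows. It is an illustration under stated assumptions, not the code behind the project. The poverty figure of $11,490 is simply the $45,960 cited above divided by four. The sliding scale of expected contributions (roughly 6.3 percent of income at 200 percent of poverty, rising to 9.5 percent at 300 to 400 percent) is an assumption about the 2014 rules and should be checked against IRS guidance. The premiums in any example call are invented.

```python
FPL_SINGLE = 11_490  # assumption: one-person poverty figure, contiguous U.S.

# Assumed 2014 sliding scale for the range the analysis covered:
# (low % of poverty, high %, share of income at low, share at high)
CONTRIBUTION_SCALE = [
    (200, 250, 0.0630, 0.0805),
    (250, 300, 0.0805, 0.0950),
    (300, 400, 0.0950, 0.0950),
]


def expected_share(pct_of_poverty):
    """Share of income a consumer is expected to pay; None if over 400%."""
    if pct_of_poverty > 400:
        return None  # above the eligibility cutoff, no tax credit
    if pct_of_poverty < 200:
        raise ValueError("this sketch only covers 200 percent and up")
    for low, high, share_low, share_high in CONTRIBUTION_SCALE:
        if low <= pct_of_poverty <= high:
            frac = (pct_of_poverty - low) / (high - low)
            return share_low + frac * (share_high - share_low)


def monthly_costs(income, benchmark_premium, chosen_premium):
    """Estimate monthly credit, net premium and cost as a percent of income.

    `benchmark_premium` is the second lowest-cost silver premium;
    `chosen_premium` is the plan actually bought (lowest-cost silver here).
    """
    share = expected_share(income / FPL_SINGLE * 100)
    if share is None:
        credit = 0.0
    else:
        expected_monthly = income * share / 12
        # No credit when the benchmark already costs less than the
        # expected contribution, as in low-premium rating areas.
        credit = max(0.0, benchmark_premium - expected_monthly)
    net = max(0.0, chosen_premium - credit)
    return {
        "credit": round(credit, 2),
        "net_premium": round(net, 2),
        "pct_of_income": round(net * 12 / income * 100, 2),
    }
```

For example, `monthly_costs(34_470, 350, 330)` estimates costs for someone at 300 percent of poverty in an area with a $350 benchmark. Dropping the benchmark to $200 produces no credit at all, which is the situation low-premium areas such as the Twin Cities raised.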

Q: What difficulties did you encounter?

Webster: Initially, we were told that the tax credit calculation relied on the second lowest-cost silver plan for the rating area. Later, after completing data collection, we found out there are places with different second lowest-cost silver plans within a rating area. Typically, this happens when one or more insurance plans are offered only in certain counties within the rating area.

We decided to stick with running our tax credit calculations at the rating area level, partly because there aren’t too many places where there are differences within a rating area, and the ones that do exist typically have premium variations of less than $25 per month.

Q: The project includes an interactive map comparing costs in the 501 rating areas across the U.S. How did you decide what to include?

Webster: The map does not include all of our data points; instead it allows users to compare costs for certain ages at key income levels. We figured putting all of the data into the interactive would be a little overwhelming and unnecessary. The goal is to show the general trends, not to provide specific costs for every scenario.

Q: What technology did you use to make the interactive map?

Vaughn Hagerty, Digital First Media

Vaughn Hagerty: The foundation for the interactive is Google Maps and custom JavaScript. A key challenge for this interactive was to manage the amount of data involved: We didn’t want to have files so large that they’d make the app too slow to load or balky.

To create the shapes of the rating areas, we used ArcGIS to dissolve county-level maps (or in some cases, ZIP-code or city-level maps) and then stitch all of those back together into a national shapefile. Then we converted the shapefile to KML (Keyhole Markup Language, a geography-specific XML variant), and used PHP to pare that down into much smaller JavaScript files. The rating data also was converted to a JavaScript format aimed at minimizing file size for more than 35,000 records.
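To give a sense of what paring KML down into a compact JavaScript file can involve, here is a minimal sketch. It is written in Python rather than the PHP the team used, and the specific shrinking steps (dropping altitude, rounding coordinates, skipping consecutive duplicate points) are assumptions for illustration, since the exact approach is not described here. The function name, variable name and precision are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def kml_to_js(kml_text, var_name="ratingAreas", precision=3):
    """Turn KML placemarks into one compact JavaScript assignment.

    Assumes each Placemark has a <name> and one or more <coordinates>
    elements holding "lon,lat,alt" triples separated by whitespace.
    """
    root = ET.fromstring(kml_text)
    shapes = {}
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        rings = []
        for coords in placemark.iterfind(".//kml:coordinates", KML_NS):
            ring = []
            for triple in (coords.text or "").split():
                lon, lat = triple.split(",")[:2]  # drop altitude
                point = [round(float(lat), precision),
                         round(float(lon), precision)]
                # Rounding can collapse neighboring points; keep one copy.
                if not ring or ring[-1] != point:
                    ring.append(point)
            rings.append(ring)
        shapes[name] = rings
    # Compact separators avoid spending bytes on whitespace.
    return "var %s=%s;" % (var_name, json.dumps(shapes, separators=(",", ":")))
```

The output is a single line such as `var ratingAreas={...};` that a page can load with a script tag, with each rating area's outline stored as a list of rounded latitude and longitude pairs.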

We had two goals for the app: Show some specific examples that illustrate how rates vary and allow users to explore the overall map to see variations nationwide or for specific regions.

For the first, we created a “tour” that guided people through a number of examples throughout the country. For the second, we built tools to let people filter the data by certain age and income groups to see how those affect costs. We grouped those costs into four “buckets,” ranges that determined what color the rating area would be on the map for certain age and income criteria.
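The bucketing idea can be sketched in a few lines of Python. The break points and colors below are hypothetical, chosen only to show the technique; the actual ranges used on the map are not given here.

```python
from bisect import bisect_right

# Hypothetical break points (monthly cost in dollars) separating four buckets.
BREAKS = [100, 200, 300]
# Hypothetical colors, lightest for the cheapest bucket.
COLORS = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def bucket_color(monthly_cost):
    """Return the map color for a cost, based on which bucket it falls in."""
    # bisect_right counts how many break points are <= the cost,
    # which is exactly the index of the bucket (0 through 3).
    return COLORS[bisect_right(BREAKS, monthly_cost)]
```

With these placeholder values, a cost of $150 a month would fall in the second bucket, and anything at or above $300 would land in the darkest one.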

Q: What should we expect next year when consumers are shopping for 2015 insurance coverage?

Snowbeck: Experts told us some of these disparities might be a quirk that’s simply the result of this being the first year. Insurance companies might have set premiums higher or lower than usual because of uncertainty about who would purchase plans through these exchanges. As a result, it’s quite likely that we’ll see an entirely different picture at the end of this year when rates are set for 2015 insurance policies.

By MaryJo Webster

MaryJo Webster is the Senior Data Reporter on Thunderdome's data team. She spearheads national data reporting projects for the Digital First Media network, and leads efforts to train journalists.
