TAP data can help

The National Institute for Computer-Assisted Reporting held its annual conference in Atlanta last weekend, its first in-person gathering since the March 2020 conference in New Orleans. Staff from the Investigative Reporting Workshop ran a demo of The Accountability Project and its search tool. If you didn’t make it to the presentation, slides are available here.

I’ve been working with The Accountability Project for years and have presented at several conferences. But this year I got a question I’d never heard before: “Why would we use this? Why not use LexisNexis?”

My answer during the demo was that if LexisNexis has the answers you need, keep doing what works!

The purpose of The Accountability Project is fundamentally different: Our goal is to point researchers to records that contain what they are looking for and to cite the source, so that they can research the original records further or check for updates. You can see our complete data catalog here.

After the talk, another attendee — an editor at a regional news group with small publications strung across a dozen states — confirmed our thinking about who might benefit most. His reporters don’t have access to costly research tools, but he thought they’d be thrilled to start running names through The Accountability Project.

Among our offerings:

We recently released a dataset of U.S. companies previously granted licenses to do business with entities on the U.S. sanctions list. Read more about it here.
We have detailed data on nonprofits’ grants and their top contractors, extracted specifically for this project.
The official release of federal campaign-finance information leaves out small-dollar donors. We’ve recreated those donor records by reprocessing raw FEC reports from pass-through PACs such as ActBlue and WinRed.

As we’ve built out The Accountability Project, it’s become clear that our simple framing narrative, that we index public datasets with names and addresses, doesn’t cover every question the data can answer. For other use cases, we’ve deployed Datasette, a simple tool that supports both prewritten database queries and custom queries run in real time.

Datasette has been useful for deploying data in ways other than a standard name or address search. During the pandemic, we began posting federal hospitalization data in this format, allowing for complex queries not otherwise available. You can see that Datasette instance in action here. Adding some simple charting tools allows for relatively quick visualizations. For instance, here’s the fraction of hospitalized COVID patients who are in ICUs. That fraction dropped as the Omicron variant became dominant last December.
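For readers curious what a query behind that ICU chart might look like, here is a minimal sketch using Python's built-in sqlite3 module (Datasette serves SQLite databases). The table name, column names and rows are all invented for illustration; the real hospitalization dataset will have its own schema and figures.

```python
import sqlite3

# Hypothetical table and column names; the real dataset may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE hospital_weekly (
        week TEXT,
        state TEXT,
        covid_inpatients INTEGER,
        covid_icu_patients INTEGER
    )
    """
)

# Made-up rows purely to exercise the query; these are not real figures.
rows = [
    ("2021-11-01", "AA", 100, 30),
    ("2021-11-01", "BB", 300, 90),
    ("2022-01-03", "AA", 400, 80),
    ("2022-01-03", "BB", 600, 120),
]
conn.executemany("INSERT INTO hospital_weekly VALUES (?, ?, ?, ?)", rows)

# Share of hospitalized COVID patients who are in an ICU, per week.
# Summing before dividing weights each state by its patient count.
ICU_FRACTION_SQL = """
SELECT week,
       1.0 * SUM(covid_icu_patients) / SUM(covid_inpatients) AS icu_fraction
FROM hospital_weekly
GROUP BY week
HAVING SUM(covid_inpatients) > 0
ORDER BY week
"""


def icu_fraction_by_week(connection):
    """Return (week, icu_fraction) pairs, rounded to three decimals."""
    return [
        (week, round(fraction, 3))
        for week, fraction in connection.execute(ICU_FRACTION_SQL)
    ]


for week, fraction in icu_fraction_by_week(conn):
    print(week, fraction)
```

The same SELECT statement could be pasted into a Datasette custom query box, with the table and column names swapped for the real ones.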

Another Datasette instance that provides unique insight allows ad hoc queries of nonprofits’ transactions with insiders, which are disclosed on Schedule L of IRS Form 990. You can search these by state here.
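A by-state search like this can also be scripted. Datasette's JSON API accepts a SQL statement in the sql parameter, with named parameters such as :state filled in from the query string. The sketch below only builds such a URL. The host, database name and table name are placeholders, not our actual deployment, and whether a given instance permits arbitrary SQL depends on how it is configured.

```python
from urllib.parse import urlencode

# Hypothetical host, database and table names, used only for illustration.
BASE_URL = "https://datasette.example.org"
DATABASE = "nonprofits"


def schedule_l_query_url(state):
    """Build a Datasette JSON API URL that filters rows by state.

    The value is passed as a named parameter (:state) rather than
    pasted into the SQL string, so it is never interpreted as SQL.
    """
    sql = "select * from schedule_l_transactions where state = :state"
    params = {
        "sql": sql,
        "state": state,
        "_shape": "array",  # ask for a plain JSON list of row objects
    }
    return f"{BASE_URL}/{DATABASE}.json?{urlencode(params)}"


print(schedule_l_query_url("GA"))
```

Fetching the resulting URL with any HTTP client should return matching rows as JSON, assuming the instance exposes a table and column with those names.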

If you’re interested in researching any of the other datasets we’ve processed in more detail or have recommendations for other data we should include on the site, please get in touch with us at accountability@irworkshop.org. We’d love to chat.