Saturday, May 9th, 2015
For the “Assault on Justice” investigation, WAMU 88.5 News scraped and downloaded records of charges of assaulting a police officer filed in the District of Columbia Courts from January 2012 through December 2014, found here. This produced a spreadsheet of more than 2,000 cases, including felony, misdemeanor and domestic violence assaulting a police officer (APO) charges for all District-based police agencies.
The initial scrape included case numbers, dates and charges but left out many important attributes for deeper analysis. To obtain the added detail, graduate students from the Investigative Reporting Workshop at American University, under the guidance of editors and reporters at IRW and WAMU, researched every case using public access to the records at the District Courthouse.
The students entered each case number into the system, inspected the complete case documentation online and, if enough detail was present, printed the case record, including the affidavits from the arresting officers. Cases that did not contain enough information or could not be located did not become part of the final database for analysis.
Under IRW guidance, students and IRW staff entered the relevant data from the printed reports into the database. When data entry was finished, IRW editors cleaned the database for consistency and ran integrity checks, comparing the database to the original paper documents. The finished database contained 1,966 usable cases.
Many limitations remain. IRW included partial reports in the database even when some data were missing. For example, only 91 percent of the cases included the defendant’s race and ethnicity, although the rest of the form may have been complete. Other key, standard details were left off otherwise complete reports.
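A completeness figure such as the 91 percent above is the share of records in which a given field is filled in. The following is a minimal sketch of that calculation, not IRW's actual procedure; the records, field names and values are hypothetical and do not reflect the real database.

```python
# Hypothetical records; the field names are illustrative assumptions,
# not the schema of the actual IRW/WAMU database.
cases = [
    {"case_number": "A-001", "race": "recorded", "arrest_address": "100 Example St NW"},
    {"case_number": "A-002", "race": None, "arrest_address": "200 Example Ave SE"},
    {"case_number": "A-003", "race": "recorded", "arrest_address": ""},
    {"case_number": "A-004", "race": "recorded", "arrest_address": "400 Example Rd NE"},
]


def completeness(records, field):
    """Return the share of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


race_share = completeness(cases, "race")
address_share = completeness(cases, "arrest_address")
print(f"race recorded: {race_share:.0%}")
print(f"address recorded: {address_share:.0%}")
```

Reporting the share for each field alongside any finding makes clear how much of the database that finding rests on.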
Another limitation is the practice of listing multiple charges on one form. For each case, only one outcome is listed, and it applies to the first charge on the report. The outcomes on other charges, including some charges of assaulting an officer, are not recorded. This also makes it difficult to count misdemeanor charges of assaulting a police officer when the primary charge is a non-APO felony.
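The effect of this limitation can be illustrated by separating cases where the APO charge is listed first, so the recorded outcome applies to it, from cases where an APO charge appears only further down the form. This is a sketch under assumed data; the charge labels, outcomes and field names are hypothetical.

```python
# Hypothetical cases: each lists its charges in the order they appear on the
# form, with a single outcome that applies only to the first charge.
cases = [
    {"case_number": "A-001", "charges": ["APO misdemeanor"], "outcome": "dismissed"},
    {"case_number": "A-002", "charges": ["non-APO felony", "APO misdemeanor"], "outcome": "plea"},
    {"case_number": "A-003", "charges": ["APO felony", "APO misdemeanor"], "outcome": "dismissed"},
]


def is_apo(charge):
    """True if the (hypothetical) charge label denotes an APO charge."""
    return charge.startswith("APO")


# The recorded outcome describes the APO charge only in these cases.
primary_apo = [c for c in cases if is_apo(c["charges"][0])]

# An APO charge exists here, but its outcome is not recorded.
secondary_only = [
    c for c in cases
    if not is_apo(c["charges"][0]) and any(is_apo(ch) for ch in c["charges"][1:])
]

print("APO outcome known:", [c["case_number"] for c in primary_apo])
print("APO outcome unknown:", [c["case_number"] for c in secondary_only])
```

Counting only first-listed charges would miss case A-002 entirely, which is the undercount described above.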
As with all data, other unknown errors likely exist. However, any specific incidents drawn from the database were checked against the paper record.
IRW ran descriptive statistics and other summary calculations on the data to find patterns behind the individual charges of assaulting a police officer. Because of missing data, most results should be treated as estimates and are expressed as such in the report. IRW also followed accepted rounding practices.
The map of arrest locations throughout the District was drawn from the database, which includes the arrest address. WAMU ran the arrest addresses through an online geocoder to obtain latitude and longitude coordinates, which allowed the map designer to place each address on the map and add details, such as the race of the defendant.
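The geocoding step can be sketched as attaching coordinates to each record and setting aside addresses that cannot be matched. The article does not name the geocoder WAMU used, so the example below substitutes a stand-in lookup function; the addresses, coordinates and function names are all hypothetical.

```python
def attach_coordinates(records, geocode):
    """Add lat/lon to each record using `geocode`, a callable that returns
    a (lat, lon) tuple or None when the address cannot be matched."""
    located, unmatched = [], []
    for record in records:
        result = geocode(record["arrest_address"])
        if result is None:
            unmatched.append(record)
            continue
        lat, lon = result
        located.append({**record, "lat": lat, "lon": lon})
    return located, unmatched


# Stand-in for an online geocoding service; the coordinates are made up.
FAKE_LOOKUP = {"100 Example St NW, Washington, DC": (38.90, -77.03)}


def fake_geocode(address):
    return FAKE_LOOKUP.get(address)


cases = [
    {"case_number": "A-001", "arrest_address": "100 Example St NW, Washington, DC"},
    {"case_number": "A-002", "arrest_address": "address not legible"},
]

located, unmatched = attach_coordinates(cases, fake_geocode)
print(len(located), "located;", len(unmatched), "unmatched")
```

Keeping the unmatched records in a separate list makes it easy to review them by hand rather than silently dropping them from the map.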