The Art and Science of Data-Driven Journalism

Gun Data, Maps, and Radical Transparency

The confluence of public data, digital media, and democratized publishing technology is going to lead media and advocacy organizations into challenging, uncomfortable places. Many of the issues data journalists face will be long-standing ones, like intransigent public officials or huge paper document dumps.

For instance, in the 1990s the District of Columbia water authority refused to publish the results of lead testing after it showed widespread contamination. “We got the survey from a source, but it was on paper,” said Cohen. “After scanning, parsing, and geocoding, we sent out a team of reporters to neighborhoods to spot-check the data, and also do some reporting on the neighborhoods. We ended up with a story about people who didn’t know what was near them.”

In a harbinger of tensions to come, the Washington Post team chose not to publish the addresses of people identified in the data set. “The water authority called our editor to complain that we were going to put all of the addresses online. They felt that it was violating privacy, even though we weren’t identifying the owners or the residents,” said Cohen. “It was more important to them that we keep people in the dark about their blocks. Our editor at the time, Len Downie, said, ‘You’re right. We shouldn’t just put it on the Web.’”

At the end of 2012, similar questions arose when The Journal News, a newspaper in New York, displayed the names and addresses of gun permit holders in an online map that was based upon the government’s regulatory data. The map provoked outrage. The data was public and subject to a Freedom of Information law.
Did that make it ethically sound to publish the names and addresses of permit holders?

The question of what to do about guns, maps, and disturbing data was taken up by the state legislature and senate, which passed legislation that created an anonymity exemption. The questions this situation raised, however, will be central to data journalism in every state and country around the world.

The conflict over guns and data showed how government data could be used by journalists in ways that could make many citizens quite uncomfortable. It also highlighted an issue with data quality and journalism: more than three quarters of the data in the gun map was inaccurate. The Journal News took the map offline in 2013, although a version of it endures with zooming and data access disabled.

The reality is that government data is already consulted and used daily by media. Given the increased reach and velocity of digital media, data journalists must be more conscious of ethics than ever. “Journalists broadcast and publish criminal records, drunk driving records, arrest records, professional licenses, inspection records, and all sorts of private information,” wrote Al Tompkins, a senior faculty member at the Poynter Institute. “But when we publish private information we should weigh the public’s right to know against the potential harm publishing could cause.”

Journalists need to know how to turn data into journalism in a way that serves the public interest without harming it. Through that lens, as Jeff Sonderman highlighted at the Poynter Institute, you’ll need to ask a series of basic questions. He wrote:

In every situation you face, there will be unique considerations about whether and how to publish a set of data. Don’t assume data is inherently accurate, fair, and objective. Don’t mistake your access to data or your right to publish it as a legitimate rationale for doing so. Think critically about the public good and potential harm, the context surrounding the data, and its relevance to your other reporting.
Then decide whether your data publishing is journalism.

Data journalism’s potential harms came up when WikiLeaks released data from the U.S. Department of Defense and Department of State to multiple news organizations in 2010 and 2011. Every media organization that reviewed classified cables or logs from the Pentagon and State Department had to decide not only whether to publish them but how, balancing the redaction of the names of people who might be put at risk against the public’s right to know what was done on its behalf by government.

The technical capacity to move through millions of lines of messy data in proprietary formats, however, rests with only a limited number of news organizations. If the capacity to do data journalism at scale isn’t democratized, this dynamic could enshrine traditional media power structures.

“I helped out with the WikiLeaks War Logs reporting,” said Jacob Harris, a data journalist at the New York Times. “We built an internal news app for the reporters to search the reports, see them on a map, and tag the most interesting ones. One of the unique things I figured out was how to extract MGRS [Military Grid Reference System] coordinates from within the reports to geocode the locations inside of them. From this, I was able to distinguish the locations of various homicides within Baghdad more finely than the geocoding for the reports. I built a demo, pitched it to graphics, and we built an effective and sobering look at the devastation on Baghdad from the violence.”

[...] and Press Freedom

In the United States, data journalists often run into bureaucracy, obfuscation, or years of drawn-out wrangling over Freedom of Information Act requests, fees, and redactions.
Journalists trying to acquire or use data in countries without freedom of information laws or democratic institutions have an even harder time gaining the raw material for their stories.

Charles Andersen said that the issue of open government is hugely important to questions of data journalism’s future and relevance. Andersen, who co-authored a landmark report on post-industrial journalism with Emily Bell and Clay Shirky, said that open government, which increasingly includes efforts to open data, is probably the biggest factor in the success of data journalism in developing countries. “Data journalists have a very hard time existing in countries where there isn’t open data,” he said. “For instance, there’s a huge difference between Germany and the United States. Germany has relevant laws but a culture of not sharing.”

The United States, at least by contrast, has a tradition of openness and government disclosure, said Andersen. Their research suggests that data journalism cannot exist in a given country without open government laws and policies.

If elected officials, legislators, and staff want to see media using open data, they should also take substantive steps to ensure that policies, licenses, laws, and regulations are in place to permit that reuse. Similarly, if public services based upon open data feeds are performed by private parties, freedom of information laws in many countries may well need to be extended to the entities that deliver those services. Open data initiatives that aren’t accompanied by freedom of the press or freedom of information laws are unlikely to deliver on political rhetoric promising increased transparency or accountability.