The Art and Science of Data-Driven Journalism

Computer-assisted Reporting

While the various histories of the development of computer-assisted reporting offer context for the work of today, most historians place its start in the latter half of the 20th century.30 Observers may not realize that many aspects of what is now frequently called data journalism are the direct evolutionary descendants of decades of computer-assisted reporting (CAR) in the United States. In fact, Grace Hopper, a computing pioneer, professor, and U.S. Navy rear admiral during World War II, made prescient predictions long before Nate Silver’s electoral prognostications made him a media star. In 1952, CBS famously used a mainframe computer, a Remington Rand UNIVAC, and statistical models to predict the outcome of the presidential race.31 Hopper worked with a team of programmers to input voting statistics from earlier elections into the ENIAC and wrote algorithms that enabled the computer to correctly predict the result. The model she built not only accurately predicted the ultimate outcome (a landslide victory for Dwight D. Eisenhower) with just 5 percent of the total vote in, but did so to within one percent. (Their calculations predicted 83.2 percent of electoral votes for Eisenhower; in actuality he received 82.4 percent.) The accomplishment was quite similar to what Nate Silver would do six decades later: defying the election predictions of political pundits by using statistical modeling.
In the years that followed this signal media event, change was slow, marked by pioneers experimenting with computer-assisted reporting in investigations. It was almost two more decades before CAR pioneers like Elliot Jaspin and Philip Meyer began putting cheaper, faster computers to work, collecting and analyzing data for investigative journalism.
After he was granted a Nieman Fellowship at Harvard University in the late 1960s to study the application of quantitative methods used in social science, Philip Meyer proposed applying these social science research methods to journalism using computers and programming. He called this “precision journalism, which included sound practices for data collection and sampling, careful analysis and clear presentation of the results of the inquiry.”32 Meyer applied this approach to investigating the underlying causes of rioting in Detroit in 1967,33 and the Detroit Free Press won the Pulitzer Prize for Local General Reporting the next year. Meyer’s analysis showed that college graduates were as likely to have participated in the riots as high school dropouts, rebutting one popular theory correlating economic and educational status with a propensity to riot, and another regarding immigrants from the American South. Meyer’s investigations found that the primary drivers for the Detroit riots were lack of jobs, poor housing, crowded living conditions, and police brutality.
In the following decades, journalists around the country steadily explored and expanded how data and analysis could be used to inform reporting and readers. Microcomputers and personal computers changed the practice and forms of CAR significantly as the tools and environment available to journalists expanded. More people began waking up to “newsmen enlisting the machine,” as Time magazine put it in 1996.34 Journalists were using CAR techniques and databases in many major investigations in the United States and beyond.
Data-driven reporting increasingly became part of the work behind the winners of journalism’s most prestigious prize: from Elliot Jaspin’s Pulitzer at the Providence Journal in 1979 to the work of Chris Hamby at the Center for Public Integrity in 2014, CAR has mattered to important stories.35
Brant Houston, former executive director of Investigative Reporters and Editors (IRE), said in an interview:
The practice of CAR has changed over time as the tools and environment in the digital world has changed. So it began in the time of mainframes in the late 60s and then moved onto PCs (which increased speed and flexibility of analysis and presentation) and then moved onto the Web, which accelerated the ability to gather, analyze, and present data. The basic goals have remained the same. To sift through data and make sense of it, often with social science methods. CAR tends to be an umbrella term, one that includes precision journalism and data-driven journalism and any methodology that makes sense of data, such as visualization and effective presentations of data.
By 2013, CAR had been recognized as an important journalistic discipline, as the assistant director of the Tow Center, Susan McGregor, explored last year in a Columbia Journalism Review article.36 Data had become not only an integral part of many prize-winning investigations, but also the raw material for applications, visualizations, audience creation, revenue, and tantalizing scoops.