For News Consumers
Advocates of automated journalism argue that the technology benefits news consumers by providing content that was previously unavailable and by personalizing that content to meet the needs of the individual consumer. This raises two important questions. First, how do news consumers perceive the quality of automated news? Second, what are news consumers’ requirements regarding algorithmic transparency?
Quality of automated news
As noted in the previous section, journalists commonly judge the quality of automated content as poor or just “good enough” to meet minimum expectations around clarity and accuracy of the provided information. A key criticism of automated content is that it often lacks sophisticated narration and sounds rather boring and technical. Experimental research from three countries, namely Germany, Sweden, and the Netherlands, suggests that consumer perceptions of the quality of automated news are similar to journalists’ judgments. In these studies, participants were asked to read articles written by either a human or an algorithm and rate them according to various aspects of quality.45 Despite using varied experimental designs and measures, the studies’ main findings were similar (for details see Textbox I). First, human-written news tended to earn better ratings than automated news in terms of readability. Second, automated news rated better than human-written news in terms of credibility. Third, and perhaps most important, differences in the perceived quality of human-written and automated news were rather small.
Textbox I: Evidence on the Perceived Quality of Automated News
In the first study of its kind, Christer Clerwall from Karlstad University in Sweden analyzed how people perceive the quality of news articles if they are unaware of the article’s source.46 The experimental design reflected a situation in which publishers did not byline news stories, a practice that is not uncommon for wire stories and automated news.47 Clerwall presented forty-six Swedish undergraduates in media and communication studies with an article that provided a recap of an American football game. One group saw an article generated by an algorithm, and the remaining participants saw one a human journalist had written. None of the participants knew whether a human or an algorithm had written the article he or she was seeing. The articles were written in English (and thus not in the participants’ first language), contained no pictures, and were of approximately the same length. Participants rated the article along various criteria that measured credibility and readability. Then, they had to guess whether the article was written by a journalist or generated by a computer. Interestingly, participants were unable to correctly identify the article’s source. Furthermore, the automated news article rated higher than the human-written one in terms of credibility but lower in terms of readability. In general, however, differences in quality ratings were small.
The results might seem surprising. Communication students, who would be expected to have a higher level of media literacy than average news consumers, were unable to distinguish between human-written and automated articles, and even perceived the latter as somewhat more credible. But what if readers are fully aware that they are reading automated news? How does this information affect their perception of the content’s quality? Two studies provide answers to that question.
The first study, which was presented at the 2014 Computation + Journalism Symposium at Columbia University’s Brown Institute, asked one hundred sixty-eight news consumers to rate one of four automated news articles in terms of journalistic expertise and trustworthiness.48 The articles were bylined either correctly as “written by a computer” or wrongly as “written by a journalist.” They were written in the participants’ native language (Dutch), contained no pictures, and covered the domains of sports or finance (two each). The results showed that the manipulation of the byline had no effect on people’s perceptions of quality. That is, news consumers’ ratings of expertise and trustworthiness did not differ depending on whether they were told that the article was written by a human or a computer.
The second study, which was conducted in Germany and presented at the 11th Dubrovnik Media Days in October of 2015, provides further evidence.49 This study used a larger sample of nine hundred and eighty-six participants and varied both the actual article source and its declared source. That is, instead of only using automated articles, the researchers also obtained ratings for human-written counterparts on the same topic. Participants were randomly assigned to one of four experimental groups, in which they were presented with a human-written or automated article (either correctly or wrongly declared). The articles were written in the participants’ native language (German), contained no pictures, were of similar length, and came from the domains of sports and finance (one each). Each participant saw two articles and rated their credibility, journalistic expertise, and readability. The results were similar to those obtained in previous studies. That is, participants’ quality ratings did not differ depending on whether an article was declared as written by a human or computer. Furthermore, automated articles were rated as more credible, and higher in terms of expertise, than the human-written articles. For readability, however, the results showed the opposite effect. Participants rated human-written news substantially higher than automated news.
When discussing potential reasons for the small differences, researchers suggested that consumers’ initial and perhaps subconscious expectations could have influenced the results in favor of automated news.50 According to this rationale, participants may not have expected much from automated news and were thus positively surprised when their expectations were exceeded, which potentially led them to assign higher quality ratings. In contrast, subjects may have had high expectations for human-written articles, but when the articles failed to measure up to those expectations, they assigned lower ratings. If this rationale is true, then human-written articles should have scored higher when they were wrongly declared as automated news, and vice versa. However, evidence from the German study does not support this rationale.51 In fact, the results show the opposite effect. Human-written news was perceived less favorably when readers were told the news was generated by an algorithm. Similarly, automated news was rated more favorably when readers thought a human wrote it. The results thus support the experiences of James Kotecki, head of communications at Automated Insights, who reported that news consumers have high standards for automated content. In particular, Kotecki conjectures that “knowing the news is automated can prime readers to look for signs that a robot wrote it and therefore scrutinize it more carefully.”
A more likely reason for why news consumers perceive automated and human-written news to be of similar quality relates to the actual content of the articles. Again, the German study provides insights in this regard.52 Although human-written articles were perceived as somewhat more readable than automated ones, people did not particularly enjoy reading either of them. These results might indicate a general dissatisfaction with news writing, at least for the topics of finance and sports, which were the focus of the study. Such topics are routine and repetitive tasks, often performed by novice journalists who need to write a large number of stories as quickly as possible. As a result, routine news writing often comes down to a simple recitation of facts and lacks sophisticated storytelling and narration. Since the algorithms that generate automated content are programmed to strictly follow such standard conventions of news writing, the logical consequence is that the resulting articles reflect these conventions and therefore do not differ much from their human-written counterparts. Furthermore, if automated news succeeds in delivering information that is relevant to the reader, it is not surprising that people rate the content as credible and trustworthy.
In conclusion, the available evidence suggests that the quality of automated news is competitive with that of human journalists for routine, repetitive tasks. However, it is important to note that these results cannot be generalized to topics that are not solely fact-based and for which journalists contribute value by providing interpretation, reasoning, and opinion. Automated stories for such complex problems are not yet available. That said, as noted earlier, the quality of automated news will likely continue to improve, both in terms of readability and the ability to generate insights that go beyond the simple recitation of facts. Future studies might therefore find even smaller differences between the readability of automated and human-written content. Such effects may not necessarily persist, however: readers’ initial excitement with the new technology may fade if automated news that builds on a static set of rules feels redundant, especially if dispersed at a large scale. In this case, readers may again be drawn toward fresh and creative human writing styles, generating new opportunities for journalists.
It is up to future research to track how the quality of both automated and human-written news will evolve over time. In particular, it’s worth looking at how people’s expectations toward and perceptions of such content may change, especially for controversial and critical topics that are not merely fact-based. Future studies that analyze people’s relative perception of human-written and automated news should go beyond the previous work by focusing on the why: Why is it that automated news tends to be perceived as more credible but less readable than human-written news? This, of course, requires focusing on the articles’ actual content at the sentence level and might require collaboration with linguists. Another interesting approach would be to use web analytics data to analyze actual user engagement with automated content, such as the number and duration of visits.
Algorithmic transparency
For critical and controversial topics, as in automated stories that use polling data to write about a candidate’s chance of winning an election, it is easy to imagine that readers or certain interest groups may question underlying facts or criticize the angle from which the story is being told. Similarly, when algorithms are used to create personalized stories at the individual reader level, people may want to know what the algorithm knows about them or how their story differs from what other users see. In such cases, readers may request detailed information about the functionality of the underlying algorithms.
Researchers and practitioners from the field discussed such questions in March 2015 at an expert workshop, Algorithmic Transparency in the Media, held at the Tow Center and organized by Tow Fellow Nicholas Diakopoulos. In a first step, the experts identified five categories of information that consumers of automated content may find of interest: human involvement, the underlying data, the model, the inferences made, and the algorithmic presence.53 For example, readers might want to know who is behind the automated content: what is the purpose and intent of the algorithm, including editorial goals; who created and controls the algorithm; and who is held accountable for the content? Information about human involvement may also cover which parts of an article were written by a person or an algorithm, whether the final product was reviewed by a human editor before publication, and, if so, by whom. Regarding the source data, news organizations could publish the complete raw data or, if this is not possible (e.g., due to legal reasons), provide information about the quality of the data, such as its accuracy (or underlying uncertainty), completeness, and timeliness. Furthermore, readers may want to know how the data were collected, transformed, verified, and edited; whether the data are public or private; which parts of the data were used (or ignored) when generating a story; and which information about the reader was used if the story was personalized. Regarding the actual algorithms, readers may be interested in the underlying models and statistical methods that are used to identify interesting events and insights from the data, as well as the underlying news values that determine which of those make it into the final story.
These questions provide a starting point for the kind of information news organizations might reveal about their algorithms and the underlying data. However, experts identified these questions, so they may not reflect what audiences actually think. In fact, there may not even be a demand for algorithmic transparency on the user side, as probably only a few people are even aware of the major role that algorithms play in journalism. This, of course, may change quickly once automated news becomes more widespread, and especially when errors occur. For example, imagine a situation in which an algorithm generates a large number of erroneous stories, either due to a programming error or because it was hacked. Such an event would immediately lead to calls for algorithmic transparency.
In his summary of the workshop results, Nicholas Diakopoulos points to two areas that would be most fruitful for future research on algorithmic transparency.54 First, we need to better understand users’ demands around algorithmic transparency, as well as how the disclosed information could be used in the public interest. Second, we need to find ways to disclose information without disturbing the user experience, in particular for those who are not interested in such information. The New York Times offers an example of how to achieve the latter in its “Best and Worst Places to Grow Up,” which provides automated stories about how children’s economic future is affected by where they are raised.55 When users click on a different county, the parts of the story that change are highlighted for a short period of time.
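The idea of highlighting only those parts of a story that change between two versions can be illustrated with a short sketch. The example below is not the method The New York Times uses, which is not described here. It is a minimal illustration that assumes each story version is plain text, splits it into sentences with a deliberately naive rule, and compares the two versions with Python's standard difflib module. The function names (split_sentences, mark_changes, render), the double-bracket highlight markers, and the sample text are all hypothetical placeholders.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def mark_changes(old_story, new_story):
    """Return (sentence, changed) pairs for new_story relative to old_story."""
    old = split_sentences(old_story)
    new = split_sentences(new_story)
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    marked = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans are unchanged; "replace" and "insert" spans are new text.
        # "delete" spans cover no sentences of the new story (j1 == j2).
        for sentence in new[j1:j2]:
            marked.append((sentence, tag != "equal"))
    return marked


def render(marked):
    """Wrap changed sentences in [[...]] as a stand-in for visual highlighting."""
    return " ".join(f"[[{s}]]" if changed else s for s, changed in marked)


if __name__ == "__main__":
    # Invented placeholder text, not taken from any real story.
    story_a = ("This is the placeholder story for County A. "
               "Incomes there are above the national average. "
               "This sentence is the same for every county.")
    story_b = ("This is the placeholder story for County B. "
               "Incomes there are below the national average. "
               "This sentence is the same for every county.")
    print(render(mark_changes(story_a, story_b)))
```

Comparing at the sentence level keeps the highlighting coarse enough to read comfortably. A production system would need a more robust sentence splitter and a way to fade the highlight after a short time, which is a presentation concern outside this sketch.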