
April 2012

Cultivating the Landscape of Innovation in Computational Journalism


By Nicholas Diakopoulos, Ph.D.

CUNY Graduate School of Journalism, Tow-Knight Center for Entrepreneurial Journalism

1. Introduction
Technology is rapidly shifting the ways in which news information is gathered, produced, and disseminated. Some of the core areas of computing, like databases and information retrieval, are already hard at work driving many of these changes as news organizations re-adjust to the digital era. Yet the transfer and use of computing technology in news and journalism can be accelerated. This paper takes as its premise that there may be opportunities for computational innovation in journalism that have been overlooked or are underexplored. What are some of the other technologies, beyond databases and information retrieval, that can be used to help fulfill news consumers' needs, to advance the goals of journalism, or to enhance the production and dissemination of knowledge for the public?

We begin here to develop a process to systematically analyze and explore the potential for technical innovation in journalism, both to provide a more structured way to think about innovation in journalism and to identify potentially overlooked or underexplored opportunities to create new value propositions in journalism. Systematic innovation consists of the organized search for change and the analysis of opportunities such change might offer for economic or social innovation [1]. Such a structured process needs to, at a minimum, consider: (1) what innovations are needed either to solve problems, meet user needs through new experiences, or increase efficiencies in processes; (2) whether the innovation is technically feasible and how to make it work; and (3) whether the solution fits the values of the intended users and is likely to be adopted. The crux of the process explicated here is user-centered and value-sensitive and approaches innovation from the perspective of both the people producing the news (professionals and non-professionals alike) and the people consuming it.

Placing constellations of ideas and concepts in improbable juxtapositions is often the source of new ideas.
This is the basic supposition behind combinatorial creativity and is the reason why product innovation often comes from new uses or combinations of existing knowledge or technologies. This suggests an approach for systematic innovation that involves enumerating and then combining concrete concepts that span the space of interest. Through extensive reading of the literature we developed four such conceptual typologies that span our interest space and correspond to the three considerations proffered above. These typologies include (1) relevant dimensions of computing and technology, (2) news consumers' needs, (3) journalistic goals, and (4) value-added information processes. These four typologies are geared towards helping to explore the space of product and process innovation, in particular by means of new computing technology. There are of course other types of innovation (e.g. marketing and organizational) that are not systematically considered in this paper [2].

Using the four typologies as a basis, we carried out a review of relevant computing literature in order to assess areas of the conceptual space that have received more or less attention. Each relevant piece of literature was labeled with the concepts that it addresses, and these labels were then used to produce a visualization of the conceptual space. Section 3 presents this visualization as a mechanism to identify promising areas of future activity. In section 4 we move beyond mapping what we found to developing a method to instigate the generation of new ideas in this conceptual space. We summarize and conclude our approach in section 5.

[1] Peter F. Drucker. Innovation and Entrepreneurship. HarperCollins. 1985.
[2] Oslo Manual: Guidelines for Collecting and Interpreting Innovation Data. 3rd Edition. OECD. 2005.

2. Conceptual Typologies
In this section we detail the four conceptual typologies used in our systematic innovation process. Extensive reading and research in the fields of computing, journalism, and communication was undertaken in order to identify and appropriately describe these concepts so that they could be used in our mapping and generative processes. The typologies, however, are not exhaustive, as some degree of relevance assessment was needed when deciding what concepts to include or exclude. Ultimately we strove to include concepts that were neither too abstract nor too specific, as we thought such extremes could be detrimental to effective literature mapping and idea generation.

Dimensions of Computing
We throw the word around sometimes ("computational this" or "computational that"), but what does the kernel word, computing, really mean? Definitions abound online, but perhaps the most canonical of definitions comes from Peter Denning, a professor and elder in the field of Computer Science (CS). In his words, "Computing is the systematic study of algorithmic processes that describe and transform information." [3] Computing runs a strong parallel to journalism in that it is fundamentally concerned with information, but adds another focus on the algorithmic. Computing is about information, about describing and transforming it, but also about acquiring, representing, structuring, storing, accessing, managing, processing, manipulating, communicating, and presenting it. And computing is about algorithms: their theory, feasibility, analysis, structure, expression, and implementation. A fundamental question of computing concerns what information processes can be effectively automated.

In modern CS there is an extensive body of knowledge that stems from this core notion of computing. For instance, the Computer Science Curriculum defined in 2008 indicates 14 different areas of knowledge [4]. These areas are often instantiated differently at different institutions. One institution with a useful distinction is the Georgia Tech College of Computing, which delineates some areas as belonging to core computer science and others as belonging to interactive computing. Roughly, core computer science deals with the conceptual (i.e. mathematical) and operational (i.e. nuts and bolts of how a modern computer works) aspects of computing. Interactive computing, on the other hand, mostly deals with information input, modeling, and output. There are aspects of professional practice, engineering, and design that apply in both. Some of the sub-areas of core and interactive computing are shown in Table 1.
[3] Peter J. Denning. Is Computer Science Science? CACM 48 (4). 2005.
[4] Computer Science Curriculum 2008. http://www.acm.org/education/curricula/ComputerScience2008.pdf


Table 1. Sub-areas of core and interactive computing.

Core Computer Science: Discrete Structures, Programming Fundamentals, Software Engineering, Algorithms and Complexity, Architecture and Organization, Operating Systems, Programming Languages, Net Centric Computing, Information Management, Computational Science

Interactive Computing: Human Computer Interaction, Graphics and Visual Computing, Intelligent Systems

Of the many sub-areas of computing, it's the interactive computing body of knowledge that is most interesting in terms of its potential application to journalism. The way that information is moved around inside a computer is less important for journalists to understand than the interactive capabilities of information input, modeling, and output afforded by computing. How does computing interface and interact with the rest of the world? Of course, many of the capabilities of computers studied in interactive computing rest on solid foundations of core computer science. You couldn't be a very productive data journalist without an operating system to schedule processes and manage files.

We developed a catalog of twenty-seven computing concepts and technologies, mostly stemming from the three interactive computing disciplines of Human Computer Interaction (HCI), Visual Computing, and Intelligent Systems, but also drawing on some other relevant areas (e.g. information management, modeling and computational science) as we thought necessary. The full list of computing concepts and technologies is given in Appendix A. These concepts form the bedrock for thinking about how computing and technology can combine with the concepts discussed in later sections to generate new product and process innovations.

News Consumer Needs


What is it exactly that drives people to consume news information? By focusing on the user and really understanding the underlying needs, motivations, or habits influencing news consumption we can unlock new opportunities for creating new media products or for optimizing existing ones using computing and technology. What's needed is a user model that describes the core facets or concepts of news consumption behavior.

It's important to first make the distinction between how users consume news and why they consume it. How news is consumed is largely attributable to the medium of presentation (e.g. paper, radio, TV, tablet, internet). Certainly online social networks such as Twitter and Facebook have changed how people are exposed to and consume news. Different media bias news consumption in different ways, as their own unique affordances differentially enable, place constraints on, and influence behavior. The "why" of news consumption is perhaps more fundamental though, as the underlying needs and motivations for consuming news will simply be expressed differently across media in terms of exactly how those needs are met.

There are a variety of demographic and contextual factors that can influence why people consume news or media. For instance, different studies have shown that younger people tend to consume news for the sake of escapism or passing time [5], women on average tend to be less interested in news on some topics such as science and technology [6], social co-viewing can influence people to watch television news for longer [7], and personality traits such as extraversion and openness are linked to exposure to news in politics and public affairs [8]. Taken together this research seems to indicate that there is a lot of potential for marketing innovations that formulate and design media to appeal to different tastes and individual differences. But for our purposes here, for driving new ideas in product and process innovation, these differences are at the wrong level of granularity. What we really need are slightly more abstract dimensions that describe news consumers' needs in a way that cuts across demographic and contextual factors.

The uses and gratifications (U&G) theory describes such a set of underlying dimensions. Since the 1940s communication and journalism scholars have been studying why people seek out and consume media, and eventually this work evolved into what we call U&G theory today. What are the gratifications that people receive from various kinds of media or types of content that help to satisfy their underlying social and psychological needs? Some of the earliest studies looked at why people consumed radio news, and some of the more recent look at internet technologies such as commenting systems, but the dimensions described by U&G have proven to be remarkably stable over time and across media. The Uses and Gratifications theory attempts to explain how and why people select their media, as well as how concentrated their attention is [9]. For instance, casually attending to a report for entertainment or to pass time is different from goal-oriented information seeking.
Uses and Gratifications suggests that there are four main categories of motives for why people consume media: (1) surveillance/staying informed, such as finding out about relevant events and conditions in your surroundings, (2) personal identity, including finding reinforcement for personal values and finding models of behavior, (3) integration and social interaction, including social empathy, finding a sense of belonging, or a basis for social conversation, and (4) entertainment/diversion, such as relaxing or filling time. Appendix B describes these concepts in more detail.

[5] Diddi, A. and LaRose, R. Getting hooked on news: Uses and Gratifications and the formation of news habits among college students in an internet environment. Journal of Broadcasting & Electronic Media. 50 (2). 2006.
[6] Hamilton, J. All the News That's Fit to Sell: How the Market Transforms Information into News. Princeton University Press. 2006.
[7] Wonneberger, A., Schoenbach, K. and van Meurs, L. Interest in News and Politics or Situational Determinants? Why People Watch the News. Journal of Broadcasting & Electronic Media. 55 (3). 2011.
[8] Gerber, A., Huber, G., Doherty, D. and Dowling, C. Personality Traits and the Consumption of Political Information. American Politics Research. 39 (1). 2011.
[9] Ruggiero, T. Uses and Gratifications theory in the 21st century. Mass Communication & Society. 3 (1). 2000.

Journalistic Goals
Value sensitive design attempts to account for human values in a comprehensive manner throughout the design process by identifying stakeholders, benefits, values, and value conflicts to help designers prioritize features and capabilities [10]. This is an essential point for innovating new computational products and services since it helps to ensure the adoption of these innovations by the intended users. Value can be defined as what a person or group of people consider important in life. Values include things like privacy, property rights, autonomy, and accountability among other things. What does journalism value and how do these values drive the goals of the practice? Answering this will allow for the design of tools for professionals that are more easily adopted, and for the design of tools that more easily facilitate acts of journalism by non-professionals, since they can integrate better with the ethos of the process.

Normative descriptions of journalism from both sociological [11, 12] and practical [13, 14] sources were consulted in order to identify core values and goals for our conceptual typology. These are shown in Table 2 according to whether they are primary or reinforcing factors. The three overarching primary values/goals are (1) striving for truth, (2) acting in the public interest, and (3) generally providing information about contemporary affairs of public interest. These can in turn be conceived of as being reinforced by other values, goals, or activities. Truth is supported by values such as independence, which maintains a freedom from potentially biasing influence, and impartiality, which attempts to even-handedly cover noteworthy opinions or to manage personal or organizational biases. The core value of public interest leads to other valued activities such as watchdogging and forum organizing. For instance, watchdogging is a special type of public interest activity in that it involves holding public (or other) institutions accountable for their actions; it also contributes to supporting the primary goal of truth.
Forum organizing, which is about orchestrating a public conversation and identifying and consolidating community, also works in the public interest by facilitating public information exchange. The last primary goal is informing and is supported by activities such as aggregating, sensemaking, and storytelling, which all serve to add value to information, in different ways, before it is passed on to the public at large. Many of these values and valued activities can be seen as contributing to a notion of information quality: the degree of excellence in communicating knowledge.
Table 2. Journalism values and goals and their relationships

Primary            Reinforcing
Truth              Independence, Impartiality, Watchdogging
Public Interest    Watchdogging, Forum Organizing, Informing
Informing          Aggregating, Sensemaking, Storytelling


[10] Friedman, B., Kahn, P. H., Jr., and Borning, A. Value Sensitive Design and information systems. In P. Zhang and D. Galletta (eds.), Human-computer interaction in management information systems: Foundations, 348-372. 2006.
[11] Schudson, M. The Sociology of News. 2nd Edition. W.W. Norton and Co. 2011.
[12] Kovach, B. and Rosenstiel, T. The Elements of Journalism. 2nd Edition. Three Rivers Press. 2007.
[13] APME Statement of Ethical Principles. http://www.apme.com/?page=EthicsStatement
[14] ASNE Statement of Principles. http://asne.org/kiosk/archive/principl.htm

Appendix C describes all of these concepts in more detail.

There is considerable potential for technology to re-invent and re-imagine the activities of informing, truth-seeking, and acting in the public interest: to make them more effective, efficient, satisfying, productive, and usable. Being aware of these core values also helps designers understand what would not be acceptable to design for professionals (e.g. a platform to facilitate the acquisition of paid sources would probably not be adopted in U.S. journalism). It's important to emphasize that it's the function served by the above valued activities that's significant, more so than any institutionalized practices or processes already in use to accomplish these goals. In some cases it may be entirely appropriate for some institutional practices to be substantially re-made in the face of new technologies. Value-sensitive design offers a sensible way forward to ensure that products and services embed the kinds of values that will make them trustworthy, impactful, and resonant with journalists and the public.

Value-Added Information Processes


As we saw in the last section, one of the core activities of journalism is providing information to the public, producing knowledge in a way that strives for truth. It's presented in various guises: articles, maps, graphics, interviews, and more recently even things like newsgames, but it all essentially entails the same basic components of information gathering, organizing, synthesizing, and publishing new (sometimes just new to you) knowledge. To be sure, the particular flavor of knowledge produced is colored by the cultural milieu, ethics, and temporal constraints through which journalism extrudes information into knowledge.

Much of what journalists are engaged with on a day-to-day basis is adding value to information, which includes things like making sense of it, making it more accessible and memorable, and putting it in context. Raw data and information is harvested from the world, and as the journalist gathers it and makes sense of it, puts it in context, increases its quality, and frames it for decision making, it gets more and more valuable to the end-user. And by value we don't necessarily mean economic value, but rather usefulness in meeting a user need. This point is important because it implies that the value of information is perceived and driven by user needs in context.

Sometimes this process is described as a flow from data to information to knowledge (see Figure 1). Data are numerical entities or veridical facts. Information is about adding relationships between these elements of data, or creating groupings and categorizations of data. Knowledge emerges when humans interpret, analyze, and judge information as a mechanism for driving decision-making. The process is cyclical or recursive, with the output from someone else, be it an article, tweet, or comment, potentially feeding into the process for the next output.

Figure 1. The three stages of value adding: data, information, and knowledge. Note the recursive nature of knowledge production indicated by the interpret step, which creates a new atom.

Stemming from his study of library information systems, Robert S. Taylor developed a model of value-added processes in information systems [15]. His model offers us a more structured way to think about how journalists and other knowledge producers add value to information. It also provides conceptual fodder for thinking about how technical innovation can be employed to enhance efficiency or effectiveness in these processes. Taylor organized the processes into four broad categories: ease of use, noise reduction, adaptability, and quality.

Ease of use corresponds to aspects of information design (i.e. how to format and present information), browseability (i.e. non goal-driven information access), and ordering (i.e. ranking objects along some dimension of interest). Activities that journalists engage in to improve ease of use include transforming tables of numbers into compelling maps or graphics, or writing engaging and compelling articles that make it easier to understand and remember key aspects of a story. Another dimension of ease of use is physical accessibility, which can reflect the type and characteristics of the hardware used to deliver information.

Noise reduction involves the processes of inclusion and exclusion (i.e. filtering) with an understanding of relevance that may be informed by context or end-user needs. Journalists are constantly acting as noise reducers as they assemble a story and decide what is relevant to include and what is not, and even by their very judgment of what is considered newsworthy.
Another dimension of noise reduction involves summarization, which serves to condense or simplify information while making sure it can still be validly interpreted. Finally, associating information, either through explicit links or through statistics, can reduce noise by making relevant and related information more accessible.

Adaptability is a mechanism that adds value by ensuring that information is relevant to the specific needs or interests of a person with a particular information need or problem. If you're thinking about your audience, then you're probably already adapting to them in some way or another. In technology terms, personalization and recommender systems are geared towards adapting information environments to suit individual users.

Finally, Taylor's model includes five dimensions of information quality: accuracy (i.e. freedom from error), comprehensiveness (i.e. completeness of coverage), currency, reliability (i.e. consistency and dependability), and validity (i.e. being well-grounded, justifiable, and logically correct). Issues of information quality as value-adds are of paramount importance to us here since striving for truth is a central value of journalism. Appendix D describes all of the value-added information processing concepts in more detail.

In summary, one of journalism's primary raisons d'être is gathering, producing, and disseminating information and knowledge. The processes used to produce this information and knowledge can be studied in conceptual terms such as those outlined above. But what is perhaps most interesting about these processes is that they can, in theory, all be executed either by people or by computers. It's unlikely in the near term that automated systems will fully replace people, but there are many opportunities for using technology to enhance the efficiency and effectiveness of these processes as they are guided by people.

[15] Robert S. Taylor. Value-Added Processes in Information Systems. Praeger. 1986.

3. Surveying Opportunities for Innovation


An extensive literature review was conducted with the goal of characterizing the landscape of what's already been investigated within the conceptual typologies we have defined. The literature review proceeded by examining two of the primary sources of computing literature: the ACM digital library and the IEEE digital library. Both of these libraries, which house technical and computing literature, were searched using the keyword "journalism" to identify any research that identified itself as concerned with journalism. Additional targeted topical searches on the ACM library included "information quality news media", "summarization news media", "storytelling news media", "factcheck", "watchdogging", "aggregation news media", "curate news media", and "sensemaking news media". A total of 3,181 results from ACM and 159 results from IEEE were triaged by reading paper titles and abstracts to select the most relevant ones. Relevance was determined by considering whether the work addressed news information or journalism in terms of a novel system, application, or process. This resulted in a set of 101 papers. These papers were further assessed by carefully reading their abstracts and scanning the papers themselves. Each paper was coded according to any concepts in our framework that were addressed by the work.

It's important to understand the limitations of this sample before moving on. This sample does not include literature published in communications or journalism venues that might also touch on technical issues. Due to the nature of the concept space, we focus solely on product and process innovation; we do not include instances of what might be considered organizational innovation (i.e. relating to the organization of work processes of people), or marketing innovation (i.e. relating to the non-functional aspects of design).
Our relevance criteria also exclude some research that might be more distantly or loosely related, or that produced knowledge that wasn't directly related to a product or process (e.g. user studies). Furthermore, we exclude computationally innovative products and services in the marketplace due to sampling issues and since the information necessary to evaluate such innovations is not publicly available. In the future, crowdsourcing may be a workable approach to surveying marketplace innovations. Despite the above limitations we would still argue that our review is based on a substantial enough sample to begin to see some interesting patterns of activity.

We developed a heatmap visualization in order to better understand how the literature covers our conceptual space. The matrix in Figure 2 (also shown rotated and enlarged in Appendix E) shows the computing concepts along the vertical axis, and the user needs, journalism goals, and information processes along the horizontal axis. Each cell of the matrix indicates the number of research papers coded with both of those concepts, with darker red indicating more. The rows of the matrix have been sorted such that computing concepts that were observed more
frequently in the literature are towards the top. The relative prevalence of concepts is further shown in Figure 3 by ranking (a) the computing concepts and (b) all other concepts by the number of exemplars found for each concept.

We can make several observations by inspecting the matrix and graphs. For instance, concepts such as natural language processing, data mining, social computing, and information visualization have garnered the most attention in terms of their application to news and journalism. Topics such as machine learning, knowledge representation, information retrieval, and computer vision have also gotten some attention. But only sporadic attention, or none at all, has been paid to many of the other concepts. For instance, very little research has looked at how machine translation, tangible user interfaces, agents, or virtual reality can be applied to news information or journalism. But the dearth of existing work in these areas can be seen as an opportunity: one could imagine many inventive, innovative, and robust applications incorporating these concepts. For instance, if we were to consider combining tangible user interfaces with physical accessibility we might imagine a kiosk-based tangible news interface that increases the physical accessibility of news in public spaces like parks.

Figure 2. A matrix visualization of literature as it falls into various categories of our typology. The darkest red corresponds to 12 papers and the lightest to 1 paper. Computing concepts are ranked from top to bottom according to the number of papers in which they were identified.

Another class of technologies that hasn't been applied extensively to news and journalism includes areas such as robotics, augmented reality, wearable computing, and activity recognition. These are technologies that you might consider less mature since they may not support robust end-user experiences unless used in somewhat constrained scenarios. But as these technologies improve, future research and applications would be wise to explore here.

From the matrix and graphs we can also see which user needs or journalism goals have been underexplored with respect to computing and technology. For instance, the user need of developing personal identity (not to be confused with personalization) does not appear to get any attention, at least in the literature reviewed. But one could readily imagine developing technologies or media experiences geared towards helping users find reinforcement for personal values or find models of behavior. Another possible explanation for the apparent
dearth of activity on some of these needs and goals is that they may be more often or better met through organizational or marketing innovations, which are not included in our sample. For instance, the need for personal identity is probably more often met through marketing innovations that frame content to appeal to different personalities.

Figure 3. Bar graphs of the number of research papers found related to each concept for (a) computing concepts and (b) all other concepts.

Besides user needs, there are also journalism goals, such as independence or public interest, that have gained little, if any, attention in the computing research literature. But there are surely opportunities here. If we were to combine the concepts of information visualization with independence and public interest we might imagine a visualization that could help reflect the independence of journalists or their sources in a way that makes it clear to the public what connections or influences are at play. Other journalism goals such as watchdogging or forum organizing have also garnered little attention in terms of technical product or process innovation and are ripe for innovative applications of technology. For instance, organizing a user-generated photo forum related to the news (such as that found on CNN's iReport) could be facilitated by computer vision technology to help sort and order photos in useful ways.

Finally, there are a number of opportunities for applying technology to value-adding information processes. In particular it seems there may be many opportunities to apply technologies to aspects of information quality including accuracy, comprehensiveness, currency, and validity. Quality in information processes coincides with the core journalistic goal of getting at the truth, but is nonetheless a tricky issue for news organizations. Using automated processes that expose weaknesses in quality after publication could be seen as detrimental to public credibility.
Technology for considering quality processes thus needs to be integrated tightly into the overall processes (gathering, sensemaking, storytelling, and dissemination) of news production. For instance, we could imagine a wearable computer (a bracelet? earrings?) used for reporting interviews that could automatically link excerpted quotes to the original audio via speech recognition. When published this would provide more reliability (and transparency) for quoted materials.
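
The coding and counting behind the matrix described in this section can be illustrated with a minimal sketch. It is not the tooling used for the review, which the paper does not describe. The three sample papers and their labels are invented for illustration, and the function names `cooccurrence_counts` and `rank_computing_concepts` are hypothetical. Each cell counts the papers coded with both a computing concept and one of the other concepts, and rows are ordered by how many papers mention each computing concept.

```python
from collections import Counter

# Hypothetical coded papers (invented for illustration): each paper lists the
# computing concepts and the other concepts (needs, goals, processes) it was
# labeled with.
coded_papers = [
    {"computing": ["natural language processing"],
     "other": ["summarization", "informing"]},
    {"computing": ["natural language processing", "data mining"],
     "other": ["summarization"]},
    {"computing": ["information visualization"],
     "other": ["sensemaking", "informing"]},
]


def cooccurrence_counts(papers):
    """Count papers coded with each (computing concept, other concept) pair."""
    cells = Counter()
    for paper in papers:
        # Sets ensure a label repeated within one paper is counted only once.
        for computing_concept in set(paper["computing"]):
            for other_concept in set(paper["other"]):
                cells[(computing_concept, other_concept)] += 1
    return cells


def rank_computing_concepts(papers):
    """Order computing concepts by number of papers, most frequent first."""
    totals = Counter()
    for paper in papers:
        totals.update(set(paper["computing"]))
    # Ties are broken alphabetically so the row order is deterministic.
    ordered = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ordered]


cells = cooccurrence_counts(coded_papers)
rows = rank_computing_concepts(coded_papers)
```

Pairs that never co-occur simply have a count of zero, which is how the empty cells of such a matrix would surface as candidate opportunities.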

4. Generating Opportunities for Innovation


The concept typologies described above include fifty-five distinct concepts: twenty-seven computing and technology concepts, four primary classes of user needs, ten journalism goals, and fourteen information processes (see Appendices A-D). But we want to move beyond description and survey, and in this section we address the question of "What could be in this space?" by formulating these concepts into a generative activity that can aid people in brainstorming new ideas that intersect concepts in interesting ways. In order to jump-start the creative process of understanding how intersections of these concepts could lead to innovations, we developed a brainstorming card game. Each concept is given a card and color-coded based on its main category: red for computing, purple for user needs, blue for journalism goals, and orange for information processes (see Figure 4).

The brainstorming activity we developed proceeds with groups of three people. The cards are split into two decks, one containing the computing cards and the other containing the rest of the cards. The decks are placed face down on a table. Each person in the group then picks a card at random: one person picks from the computing deck and two pick from the other deck. This is to ensure that there is at least one computing concept in play. Combining the concepts shown on the drawn cards, the group is instructed to generate as many different ideas as possible in five minutes. Brainstorming can happen in many different ways, though we stress quantity of ideas since research has shown that stressing quantity over quality tends to ultimately yield more high-quality ideas¹⁶. A recorder in the group is identified and is tasked with recording the concept cards drawn and all the ideas that the group generates. A sequence of several five-minute rounds can be played to let everyone have a chance at seeing and combining different concepts. After several rounds of brainstorming each group then selects one or two ideas to share and discuss with the larger group.
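The draw described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the activity as it was run: the card names are taken from Appendices A-D, while the function names `draw_round` and `count_possible_draws` are hypothetical, and the sketch assumes that both decks are returned to full and reshuffled before every round, a detail the description leaves open.

```python
import random
from math import comb

# Card names as listed in Appendices A-D.
COMPUTING = [
    "Social Computing", "Natural User Interfaces", "Tangible User Interfaces",
    "Mobile and Ubiquitous Computing", "Wearable Computing",
    "Information Visualization", "Media Synthesis",
    "Non-photorealistic Rendering", "Animation", "Motion Capture",
    "Virtual Reality", "Augmented Reality", "Computer Vision", "Game Engines",
    "Computational Photography", "Agents", "Robotics", "Machine Learning",
    "Natural Language Processing", "Machine Translation", "Speech Recognition",
    "Activity Recognition", "Knowledge Representation", "Data Mining",
    "Hypermedia", "Information Retrieval", "Modeling and Simulation",
]
USER_NEEDS = [
    "Staying Informed", "Personal Identity",
    "Integration and Social Interaction", "Entertainment",
]
JOURNALISM_GOALS = [
    "Truth", "Independence", "Impartiality", "Public Interest", "Watchdogging",
    "Forum Organizing", "Informing", "Storytelling", "Aggregating",
    "Sensemaking",
]
INFO_PROCESSES = [
    "Information Design", "Browseability", "Physical Accessibility",
    "Ordering", "Filtering", "Enriching", "Summarization", "Associating",
    "Adaptability", "Accuracy", "Comprehensiveness", "Currency",
    "Reliability", "Validity",
]

# The second deck holds every card that is not a computing card.
OTHER = USER_NEEDS + JOURNALISM_GOALS + INFO_PROCESSES


def draw_round(rng):
    """Draw one computing card and two distinct cards from the other deck.

    Assumption: both decks are complete at the start of the round.
    """
    computing_card = rng.choice(COMPUTING)
    other_cards = rng.sample(OTHER, 2)  # sampling without replacement
    return computing_card, other_cards


def count_possible_draws():
    """Number of distinct three-card draws (order of the two 'other' cards ignored)."""
    return len(COMPUTING) * comb(len(OTHER), 2)


if __name__ == "__main__":
    rng = random.Random()  # pass a seed here for a repeatable draw
    computing_card, other_cards = draw_round(rng)
    print("Computing card:", computing_card)
    print("Other cards:", ", ".join(other_cards))
    print("Distinct possible draws:", count_possible_draws())
```

Under these assumptions there are 27 × C(28, 2) = 10,206 distinct three-card draws, so a handful of five-minute rounds samples only a small part of the concept space.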

Initial Experiences
To better understand if and how the activity was working and how to improve it, we presented the concepts and the activity to a class of fifteen entrepreneurial journalism students at CUNY Graduate School of Journalism. In a series of three 5-minute rounds of brainstorming, five groups generated 54 ideas in total, for an average of 3.6 ideas per group per round. In a follow-up session we also ran the activity with eleven media-industry professionals including people with backgrounds in technology or journalism, or both. In a series of five 5-minute rounds of brainstorming the three groups of industry professionals produced 53 ideas, or about 3.5 ideas per group per round. Students and professionals thus produced ideas at comparable rates. There was some variability between groups, but the overall reaction from students and professionals was positive, with several interesting ideas produced. The discussion phase, with the smaller groups presenting one or two ideas back to the larger group, was quite useful, both as a mechanism to congeal ideas and as a way to provoke a dialogue as each idea ricocheted among the larger group.

Figure 4. Brainstorming cards.

16 Paulus, P. B., Kohn, N. W. and Arditti, L. E. (2011), Effects of Quantity and Quality Instructions on Brainstorming. The Journal of Creative Behavior, 45: 38-46.

Some of the ideas generated were for general products or services, but some were also about how technologies could enable new kinds of stories to be told (i.e. editorial creativity). One idea for a general platform was to produce 3D virtual recreations of traffic intersections prone to accidents in order to help viewers get a better sense of why that spot could be dangerous. Another interesting idea involving the concepts of computer vision and summarization was to assess audience reactions to events by analyzing facial expressions in photos or videos of a crowd. This could be valuable not just for reporting on events, but perhaps also for audience testing other forms of media. A creative idea involving robotics was to present sports patterns or plays using robots to act out the dynamics. Could this be a new form of entertaining replay? In terms of editorial creativity some ideas included using motion capture technology to recreate crime scenes or analyses, or to illustrate workplace injuries from repetitive stress.

Not all of the ideas produced were totally original, nor would they all make viable businesses or generate millions of clicks for publishers. But that's okay since our main goal here was to generate lots of fodder for the downstream process of winnowing by business or market criteria. In the future we're interested in looking into ways of codifying evaluative criteria for what makes a "good" idea so that these can be further integrated into the overall brainstorming process. We are also interested in continuing to run the brainstorming activity with different types of participants, and with minor adjustments to instructions or duration of the activity.
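Returning to the idea counts reported for the two sessions, the per-group, per-round rates can be checked with a short calculation. The sketch below is illustrative only; the helper name `ideas_per_group_round` is hypothetical, and the inputs are the session figures given in this section (54 ideas from five groups over three rounds, and 53 ideas from three groups over five rounds).

```python
def ideas_per_group_round(total_ideas, groups, rounds):
    """Average number of ideas produced by one group in one round."""
    return total_ideas / (groups * rounds)


# Student session: 54 ideas, five groups, three 5-minute rounds.
students = ideas_per_group_round(54, groups=5, rounds=3)

# Professional session: 53 ideas, three groups, five 5-minute rounds.
professionals = ideas_per_group_round(53, groups=3, rounds=5)

print(f"students: {students:.2f}")            # prints "students: 3.60"
print(f"professionals: {professionals:.2f}")  # prints "professionals: 3.53"
```

Both sessions comprise fifteen group-rounds, which is why the similar totals translate directly into similar rates.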

5. Summary
The purpose of this paper has been to describe and articulate a systematic method for identifying and generating opportunities for innovation, particularly for products or processes, in computational journalism. The method is grounded in ideas of user-centered and value-sensitive design, which drive a need to understand news producers as well as consumers as we consider the application of computing and technology. By drawing on a wide range of literature we contribute an articulation of a set of fifty-five concepts spanning computing, user needs, journalism goals, and information processes. Using this concept space our visualization of computing research activity helps to identify underexplored areas such as tangible interfaces, agents, wearable computing, and activity recognition, among others. And finally, our brainstorming activity proved in our initial sessions to be an effective and engaging method for students and industry professionals to generate novel ideas.


Appendix A: Computing Concepts and Technologies


Twenty-seven dimensions of computing drawn from the interactive and core computing bodies of knowledge.

Social Computing
The intersection of social behavior and computing including collaboration (i.e. synchronous or asynchronous coordination between people), online communities and social networks, and social information processing (e.g. collaborative filtering, tagging, commenting).

Natural User Interfaces


Interfaces that mimic aspects of intuitive everyday human behavior and seem invisible and natural to the user (e.g. gesture, touch/haptics, speech, brain).

Tangible User Interfaces

Interfaces that use physical artifacts as representations and controls for digital information.

Mobile and Ubiquitous Computing

The integration of computing (e.g. sensing, information gathering, output) into everyday objects in the environment, and into portable devices.

Wearable Computing

The integration of computing into the personal space of the user in an unobtrusive and constantly accessible way (e.g. a computational prosthetic).

Information Visualization
The use of interactive visual representations of abstract data to amplify cognition or to communicate.

Media Synthesis
The creation or editing together of new media including visual images or multimedia.

Non-photorealistic Rendering

The creation of non-realistic images (e.g. artistic, non-physically based).

Animation

The creation of a series of images that impart motion to the objects depicted.

Motion Capture

The capture and storage of geometric motion information from real objects.

Virtual Reality

The realistic 3D simulation of an immersive environment.

Augmented Reality
The creation of a view of the physical world that is overlaid with digital media (e.g. 3D models, images, or other data).

Computer Vision

Techniques for producing information from images that can be used to drive decisions and processes (e.g. navigation, interaction, organization, detection).

Game Engines

Platforms that allow for rapid development of modeled and simulated environments.

Computational Photography

Image capture which utilizes computing (e.g. stitching, multiple exposure, tone mapping) or specialized optics that require computer processing before display (e.g. lightfields).


Agents

Autonomous or semi-autonomous entities that observe and act on the environment and direct their activity towards achieving goals.

Robotics

Mechanical agents that perform tasks and can be autonomously, semi-autonomously, or remotely controlled.

Machine Learning
Algorithms that allow for the recognition of generalizable patterns or categories from data which may facilitate intelligent decisions based on such data.

Natural Language Processing

Algorithms that allow for the parsing and understanding of human language.

Machine Translation

Algorithms that allow for the automatic translation between human languages.

Speech Recognition

Algorithms that allow for the recognition and understanding of spoken language.

Activity Recognition

Algorithms that allow for the understanding of behaviors in the environment based on sensors (e.g. sound, image, GPS).

Knowledge Representation

The modeling of knowledge to facilitate automatic reasoning and inferencing, such as through semantic networks or classification schemes.

Data Mining

The automatic or semi-automatic analysis of data to extract clusters, anomalies, or other relationships.

Hypermedia

The connection of information or media (e.g. graphics, audio, video, text) in such a way as to create a non-linear medium.

Information Retrieval

The search for relevant documents, information, or data often with respect to some human information need expressed as a query.

Modeling and Simulation

The construction and manipulation of abstract (i.e. mathematical or statistical) and often simplified representations and behaviors of real phenomena.


Appendix B: News Consumers' Needs


Four dimensions drawn from the uses and gratifications theory that help explain how and why people consume media.

Staying Informed

Finding out about relevant events and conditions in one's immediate surroundings, society, and the world; seeking advice on practical matters, or opinion and decision choices; satisfying curiosity and general interest; learning, self-education.

Personal Identity

Finding reinforcement for personal values; finding models of behavior; identifying with media actors; gaining insight into one's self.

Integration and Social Interaction


Insight into circumstances of others including social empathy; identifying with others and gaining a sense of belonging; finding a basis for conversation and social interaction; enabling connection with family, friends, society.

Entertainment

Escaping, relaxing, cultural or aesthetic enjoyment, filling time, emotional release, sexual arousal.


Appendix C: Journalistic Goals


Ten journalistic values, goals, and activities drawn from normative descriptions of journalism practice and ethics.

Truth
Striving for accuracy, transparency, and context including assessing the truth-value of others' claims.

Independence
Free from influence by those covered or monitored (e.g. governments, politicians, organizations).

Impartiality

An attempt to even-handedly and fairly cover noteworthy opinions on an issue, to manage personal or organizational biases, and to mark personal opinion clearly (e.g. editorial).

Public Interest

On the side of the public rather than for other actors like organizations or governments.

Watchdogging

Making sure powerful institutions or individuals are held to account for their actions.

Forum Organizing

Orchestrating a public conversation and identifying and consolidating community.

Informing

Gathering and reporting, enriching, and disseminating information that people need or want about contemporary affairs of public interest.

Storytelling
Striving to convey information in an engaging yet enlightening, relevant, and meaningful way.

Aggregating

Collecting, curating, and organizing information.

Sensemaking

Establishing informational relationships and context, and drawing valid interpretations.


Appendix D: Value-Added Information Processes


Fourteen attributes or processes that can add value to information by making it more useful for a user need.

Information Design
Formatting and presenting information to facilitate ease of use and understanding.

Browseability
Presenting information to support casual rather than goal-driven consumption.

Physical Accessibility

Reducing physical barriers to use.

Ordering

Ranking or otherwise placing information in a logical sequence.

Filtering

Including or excluding information according to various criteria such as relevance, categories, or other discriminators.

Enriching

Augmenting information with metadata such as tags or other descriptors.

Summarization

Condensing and simplifying information while maintaining valid interpretability.

Associating

Defining relationships in information using hyperlinks, statistics (e.g. correlation, similarity, clusters), or semantics (e.g. A likes B).

Adaptability

Ensuring that information is relevant to the specific needs or interests of a person with a particular problem, such as through personalization, recommendation, or user-centered design.

Accuracy

Ensuring information is free from mistake or error and that it conforms to the truth insofar as the truth is knowable at the time.

Comprehensiveness

Ensuring information is thorough and complete.

Currency

Ensuring information is up-to-date.

Reliability

Ensuring information and information sources are dependable, trustworthy, and credible.

Validity

Ensuring information is well-grounded, justifiable, and logically correct as well as that assumptions are acceptable and factual evidence is relevant.


Appendix E: Literature Survey Heatmap


A rotated and expanded view of Figure 2 indicating how the research we found spans the concepts in our typology. The darkest red corresponds to 12 papers and the lightest to 1 paper.


Acknowledgements
This work has benefitted greatly from the support and input of many people at the CUNY Graduate School of Journalism and the Tow-Knight Center for Entrepreneurial Journalism including Jeff Jarvis, Jeremy Caplan, Peter Hauck, Jennifer McFadden, and all of the entrepreneurial journalism students and industry professionals who participated in our brainstorming sessions.

About the Author


Nicholas Diakopoulos is a researcher and consultant in New York City, specializing in human-computer interaction for computational media applications. He received his Ph.D. in Computer Science from the School of Interactive Computing at the Georgia Institute of Technology, where he helped launch the program in Computational Journalism. He was also a Computing Innovation Fellow at Rutgers University School of Communication and Information from 2009 to 2011, where he researched social media and visual analytics as they relate to news and information. Nick can be contacted via email at nicholas.diakopoulos@gmail.com, and is online at @ndiakopoulos and http://www.nickdiakopoulos.com.