Vague but exciting…
—Mike Sendall, annotation of Information Management: A Proposal
The computer scientist Tim Berners-Lee is best known as the inventor of the web. In 1989, while working at the European Organization for Nuclear Research, known as CERN, Berners-Lee sought “to persuade CERN management that a global hypertext system” was in the best interests of the organization. So he wrote Information Management: A Proposal and passed the document, with misnumbered figures and unfinished references, along to his supervisors. At least one version of the document was returned to Berners-Lee with an annotation written by his boss in the header atop the first page: “Vague but exciting…”1
Another proposal, of sorts, about a somewhat vague but exciting future was written by the computing pioneer Vannevar Bush in 1945 just months before the end of World War II. Bush’s essay “As We May Think,” published in The Atlantic, argued for the creation of new machines and processes that would make knowledge more accessible, assist creative thinking, and help people manipulate and link together data for the purposes of scientific inquiry.2 Bush’s explanation of knowledge records, the storage and transmission of information, and the extension of wisdom is read, today, as a proto-hypertext system, as an antecedent of the web.
“As We May Think” is perhaps most famous for its description of a machine that would augment the “associative trails” of the human mind. Bush called this machine the memex, a “device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”3 And this supplement to and extension of human memory utilized annotation. Users of the memex would “add marginal notes and comments” to their data. Annotation was a means of “tying two items together,” of building “trails of interest.”
Bush envisioned a future in which human cognition and collaborative knowledge building were processes augmented by intelligent machines. And that future was annotated.
In our final chapter we examine the future of annotation amidst ongoing social, scientific, and technological change. This book has discussed historic and contemporary relationships among annotation and information, commentary, conversation, power, and learning. We now consider an annotated future.
Of course, in such a future readers will continue to write in their books, irrespective of a book’s form, and private marginalia will persist as a cherished idiosyncratic practice. Many of us will continue reading annotation every day; notes will help us make meaning of our favorite media, and notes will feature prominently among signs, graphs, maps, other visualizations, and online comment systems. Educators will continue to mark up their students’ essays, providing both handwritten and digital feedback to encourage learning, just as students will continue to annotate textbooks as an indicator of their schooling and close reading. And journalists will continue to annotate and fact-check the news, scientists will map the genomes of new species, programmers will annotate their computer code, and troves of data will be labeled, both visually and semantically, and by both people and machines, to advance more sophisticated AI. This is all part of our annotated future. Annotation will continue to be an everyday activity.
The persistence and ubiquity of annotation will be matched, in equal measure, by the genre’s complexity. The future of annotation - a future that continues to provide information, share commentary, spark conversation, express power, and aid learning - will lack neither opportunity nor tension. Though we cannot predict specific social or technical shifts, we do have several questions about the future of annotation as a genre. What follows are three questions, perhaps a bit vague but (we hope!) exciting, that may help us to define our annotated future.
Let’s inventory your annotation. Picture well-worn books, marginalia beside poetry, revisions to cookbook recipes, online commentary, and product reviews. Maybe you collaboratively annotate because of your work, studies, hobbies, or civic engagement. As you inventory everyday annotation, it’s likely not a stretch to suggest that you’re an annotator of media and that your annotation is multimodal. Your bookshelves, whether material and musty or digital and distributed, may overflow with annotation.
While some people might jot notes in a favorite novel never to crack the cover again, we suspect others want to revisit their annotation and make use of all these notes added to all these texts. Whether for personal interest or professional inquiry, perhaps you desire the ability to gather all your disparate annotation together, in the same place, at the same time, and in a common format that allows your annotation content to be searched, remixed, analyzed for patterns, incorporated into projects, and then archived for later reference. There is benefit from annotation guiding how you work with different content, learn from ideas, and produce something new.
In other words, annotation might strengthen your information infrastructure. Amidst an obsession with the labor of “knowledge workers” and “creatives,” annotation points toward a more concrete future defined by the ability to quickly access, easily use, learn from, and create with everything that you - and your friends and your collaborators - have written atop everything that you’ve collectively read. This social and technical capability will shape a future in which the work of journalists, scientists, learners and educators, scholars, and others will rely upon annotation as fundamental infrastructure connecting together people, ideas, and knowledge.
This annotated future was envisioned in Bush’s “As We May Think” and echoed a few decades later in Engelbart’s Augmenting Human Intellect. Today, the popularity of annotation has led to a proliferation of digital technologies and services: some are browser-based, others are stand-alone applications or are built into electronic publication platforms, and a smaller number are open-source. And yet, with most digital annotation tools, your annotation - including its content and metadata - remains tethered to a particular technology, readable and writable only on that platform, and subject to an often confusing (and unread) Terms of Service.
When you author annotation using Technology A or App B, what intellectual property rights do you retain, if any? Do you own your annotation data if using Company C’s service, and can you ever permanently delete your annotation? How do you transfer your annotation from Tool D to Platform E when starting a new project, and how might you publish your annotation for others to read and use? Perhaps paradoxically, the growth of digital annotation technology has created product-focused technical silos. As a result, it’s difficult for people to take full advantage of annotation strengthening their information infrastructure.
In the future, some people may pause before adding notes to a text if, in using a technology, they remain unsure about intellectual property protections, data use policies, or whether their annotation contributes to a broader information infrastructure. Might a day come when groups of people - initially professionals and members of knowledge communities, and then perhaps the broader public - refuse to author annotation using certain tools? Fortunately, both scholars and organizations have published resources that carefully evaluate annotation technologies and can usefully assist people in selecting appropriate tools.4 And, perhaps most significantly, the W3C’s creation of an open web standard in 2017 heralded a new and formative era in which companies, publishers, and developers can choose to design standards-aligned tools that are interoperable, perhaps also open-source, and that benefit you, the annotator - whether your information infrastructure is composed of annotated web pages, PDFs, or epubs.
In the nineteenth century, standardized railroad gauges enabled Americans to build the transcontinental railway and connect people, places, products, and ideas. An annotated future envisions analogous infrastructure for digital spaces, across personal and professional contexts, contributing to a shared and robust information infrastructure. Standardized, interoperable, and open web annotation technology, like Hypothesis, now makes it possible to interconnect readers, learners, professionals, content, and software. While the Hypothesis technology may, one day, become obsolete, the organization’s effort to establish a web annotation standard will persist and, we hope, influence others to help build a more open, accessible, seamless, and stronger information infrastructure for readers and their annotation.
Might annotation become an information infrastructure? We believe so. The question of “how” is now no longer solely focused on technological innovation but, rather, concerns shifting social norms, changing financial and organizational incentives, and evolving scholarly practices.
Annotation has, for centuries, been mediated by technological innovation. From quill and ink, to the printing press and book formatting, to digital applications and platforms, annotation is - and always has been - tightly coupled to the technologies of the day. While people have used various tools and machines to annotate, the future of annotation will feature notes added to texts as people and machines work together in hybrid configurations. Our annotated future will also be authored by human-machine collaboration. How might we read, or make sense of, such annotation?
Intelligent technologies and methods are demonstrating the utility of human-machine annotation as automated, efficient, and accurate. Scientific research is a promising frontier for such human-machine annotation. Scientists from Google, MIT, and Cambridge University have created and trained a “deep learning” computer program capable of automatically annotating millions of protein sequences 200 times faster than similar models while also reducing the error rate.5 Such deep learning models, according to the scientists, “can rapidly and efficiently annotate novel protein sequences” with implications for the future of molecular biology research, biotechnology development, and - generally - scientific progress.
Consider, as well, advances in the publication and use of biomedical research. There is a need for scientists to use stable and unique identifiers for the many antibodies, model organisms, and tools (like software) named and published in their literature.6 As scientists generate and incorporate Research Resource Identifiers (RRIDs) into publications, SciBot - an automated curation assistant - has crawled the scholarly web and automatically added thousands of open Hypothesis annotations to hundreds of research articles so as to identify, curate, and note any discrepancies associated with published RRIDs.7 SciBot has been called a “machine/human partnership” as human “validators” can assess SciBot’s accuracy and add additional information via annotation. With human-machine annotation helping to improve the quality of biomedical research and the value of publication metadata,8 it’s not trivial to anticipate similarly productive annotation-enabled advances in other fields and among other knowledge communities.
And yet, as we discussed in Chapter 5, human-machine annotation may not always be a generous, public-facing “partnership” given the complexities of power, labor, and profit. While human-machine annotation will be an important facet of our annotated future, it may also be troublesome. For hours each day, annotators repetitively draw bounding boxes around digital images to identify vehicles and train machine learning systems for subsequent and automated annotation. So, too, do labelers annotate social media data so as to infer relationships about personal preferences, to classify behavior, and to improve a company’s systems and services. In 2019, Reuters reported that over 250 contract workers in India had, surreptitiously and for years, annotated Facebook users’ status updates, photos, and content - without users’ explicit permission and likely in violation of the European Union’s General Data Protection Regulation policy - to improve the company’s data labeling efforts and the platform’s AI systems.9
Are data annotators working in “partnership” with algorithms and companies? Is it useful to describe these relationships as extractive manual labor or, alternatively, an innovation of the knowledge economy? And what of transparency and accountability associated with labeling practices?
Human-machine annotation will help propel the next generation of scientific discovery, safely guide autonomous vehicles about your city, and recommend another mini-series to binge. The authoring of human-machine annotation will proliferate, and new annotation arrangements will be automated and lauded for greater speed, accuracy, and insight. It will become increasingly normative, in many professional contexts, to arrange human-machine annotation whereby people and technology make intelligent decisions in coordination with one another.
At the same time, it will also be important to assess how different configurations of human-machine annotation reflect particular social, economic, and ethical commitments. The relationship between annotation and data ethics is becoming increasingly acute. The role of human-machine annotation in making sense of, making use of, and making profit from (often proprietary) data should be read as ethically ambiguous - perhaps promising, perhaps also problematic. Amidst criticism that AI can be biased because of human prejudice and ignorance literally coded into technical systems,10 it is important to also recognize how annotation - and, potentially, labelers’ biased annotation - contributes to the development and training of such technology. If our annotated future is to also be ethical, then various constituents must critically read and actively question the value and differential benefits of human-machine annotation.
We began our book by mentioning Special Counsel Robert Mueller’s report about Russian interference in the 2016 presidential election. We also discussed a flurry of journalism featuring annotation, including for the purposes of fact-checking. The Mueller report and its subsequent controversy, as well as the trend of journalists annotating the news, are but a few indicators that underscore America’s seemingly intractable “post-truth” era.
In his book Post-Truth, Lee McIntyre suggests a number of strategies for fighting back against “media fragmentation, information bias, the decline of objectivity, and the threat not just to knowing the truth but to the idea of truth itself.”11 McIntyre’s strategies include creating counternarratives that are supported by empirical evidence, repeating true facts in multiple venues and in various ways, learning to be skeptical and questioning confirmation bias, and influencing how people react to changing reality. Annotation, we believe, can assist with and help to strengthen all of these strategies.
We glimpse these strategies in the work of media literacy educators like Michael Caulfield whose students incorporate open annotation into their efforts identifying and analyzing fake news.12 We are heartened by the volunteer efforts of scientists with Climate Feedback whose annotated peer review of climate change journalism corrects the record and encourages media accountability. And we’re inspired by the Credibility Coalition, a group that is working to create “better standards, rigorous research, and thoughtful application around information quality.”13 Annotation is a part of their toolkit, too. The historian Anthony Grafton once observed, “text persuades, the notes prove.”14 Today, we tell people to “show me the receipts.” We’re not suggesting annotation is the answer to broken political dialogue or rampant misinformation in our post-truth moment. We are, however, hopeful that annotation will continue to usefully complement strategies that encourage truthful journalism, a healthier information ecology, and more informed public discourse.
We are also hopeful that the future of annotation will continue to broaden participation in civic, scholarly, and educational practices that promote transparency and inclusion. A project like Madison demonstrates how annotation contributes to crowdsourced governance through the collaborative creation of new public policy. Advances in open peer review suggest annotation can mediate productive conversation and help to improve the production of new knowledge. Science in the Classroom illustrates how annotation complements free educational resources, helps experts share their knowledge with novices, and can further develop students’ familiarity with disciplinary methods. And the Right to Learn Undergraduate Research Collective puts annotation to work in the service of social change for a more dignified educational future.
All of these initiatives leverage open annotation tools and processes. It’s not annotation technology that’s preventing governments from inviting constituents to annotate draft legislation or prohibiting scholarly organizations from opening up the ritual of peer review. It is, rather, organizational priorities and political will that inhibit the use of annotation for the public good.
We are hopeful about the future of annotation. Annotation is and will continue to be an everyday activity. And annotation is and will increasingly be encouraged by technologies that are easier to use, are more openly accessible, and that mediate meaningful participation in activities of social consequence. How might you annotate for the public good?
Our questions about the future of annotation are not complete without your shared inquiry. And our musings about an annotated future remain incomplete without your commentary. We invite you to share openly through social media your questions, thoughts, and annotation using #AnnoConvo. While this may be the end of a book, we hope it’s the beginning of a conversation about and through annotation in other digital spaces mediated by marginalia. Because marking up a book is an act of love, please annotate the various illustrations included in this book (and elsewhere, too!), and then place your ideas about Annotation into dialogue with other readers, annotators, texts, and contexts. Add your note to this text.