Vague but exciting…
—Mike Sendall, annotation of Information Management: A Proposal
The computer scientist Tim Berners-Lee is best known as the inventor of the web. In 1989, while working at the European Organization for Nuclear Research, known as CERN, Berners-Lee sought “to persuade CERN management that a global hypertext system” was in the best interests of the organization. So he wrote Information Management: A Proposal and passed the document, with misnumbered figures and unfinished references, along to his supervisors. At least one version of the document was returned to Berners-Lee with an annotation written by his boss in the header atop the first page: “Vague but exciting…”1
Another proposal, of sorts, about a somewhat vague but exciting future was written by the computing pioneer Vannevar Bush in 1945 just months before the end of World War II. Bush’s essay “As We May Think,” published in The Atlantic, argued for the creation of new machines and processes that would make knowledge more accessible, assist creative thinking, and help people manipulate and link together data for the purposes of scientific inquiry.2 Bush’s explanation of knowledge records, the storage and transmission of information, and the extension of wisdom is read, today, as a proto-hypertext system, as an antecedent of the web.
“As We May Think” is perhaps most famous for its description of a machine that would augment the “associative trails” of the human mind. Bush called this machine the memex, a “device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”3 And this supplement to and extension of human memory utilized annotation. Users of the memex would “add marginal notes and comments” to their data. Annotation was a means of “tying two items together,” of building “trails of interest.”
Bush envisioned a future in which human cognition and collaborative knowledge building were processes augmented by intelligent machines. And that future was annotated.
In our final chapter we examine the future of annotation amidst ongoing social, scientific, and technological change. This book has discussed historic and contemporary relationships among annotation and information, commentary, conversation, power, and learning. We now consider an annotated future.
Of course, in such a future readers will continue to write in their books, irrespective of a book’s form, and private marginalia will persist as a cherished idiosyncratic practice. Many of us will continue reading annotation every day; notes will help us make meaning of our favorite media and will feature prominently among signs, graphs, maps, other visualizations, and online comment systems. Educators will continue to mark up their students’ essays, providing both handwritten and digital feedback to encourage learning, just as students will continue to annotate textbooks as an indicator of their schooling and close reading. And journalists will continue to annotate and fact-check the news, scientists will map the genomes of new species, programmers will annotate their computer code, and troves of data will be labeled, both visually and semantically, by both people and machines, to advance more sophisticated AI. This is all part of our annotated future. Annotation will continue to be an everyday activity.
The persistence and ubiquity of annotation will be matched, in equal measure, by the genre’s complexity. The future of annotation, one that continues to provide information, share commentary, spark conversation, express power, and aid learning, will lack neither opportunity nor tension. Though we cannot predict specific social or technical shifts, we do have several questions about the future of annotation as a genre. What follows are three questions, perhaps a bit vague but (we hope!) exciting, that may help us to define our annotated future.
How can annotation become an information infrastructure?
Let’s inventory your annotation. Picture well-worn books, marginalia beside poetry, revisions to cookbook recipes, online commentary, and product reviews. Maybe you collaboratively annotate because of your work, studies, hobbies, or civic engagement. As you inventory everyday annotation, it’s likely not a stretch to suggest that you’re an annotator of media and that your annotation is multimodal. Your bookshelves, whether material and musty or digital and distributed, may overflow with annotation.
While some people might jot notes in a favorite novel never to crack the cover again, we suspect others want to revisit their annotation and make use of all these notes added to all these texts. Whether for personal interest or professional inquiry, perhaps you desire the ability to gather all your disparate annotation together, in the same place, at the same time, and in a common format that allows your annotation content to be searched, remixed, analyzed for patterns, incorporated into projects, and then archived for later reference. There is benefit in annotation guiding how you work with different content, learn from ideas, and produce something new.
In other words, annotation might strengthen your information infrastructure. Amidst an obsession with the labor of “knowledge workers” and “creatives,” annotation points toward a more concrete future defined by the ability to quickly access, easily use, learn from, and create with everything that you (and your friends and your collaborators) have written atop everything that you’ve collectively read. This social and technical capability will shape a future in which the work of journalists, scientists, learners and educators, scholars, and others will rely upon annotation as fundamental infrastructure connecting people, ideas, and knowledge.
This annotated future was envisioned in Bush’s “As We May Think” and echoed a few decades later in Engelbart’s Augmenting Human Intellect. Today, the popularity of annotation has led to a proliferation of digital technologies and services: some are browser-based, others are stand-alone applications or are built into electronic publication platforms, and a smaller number are open-source. And yet, with most digital annotation tools, your annotation, including its content and metadata, remains tethered to a particular technology, readable and writable only on that platform, and subject to an often confusing (and unread) Terms of Service.
When you author annotation using Technology A or App B, what intellectual property rights do you retain, if any? Do you own your annotation data if using Company C’s service, and can you ever permanently delete your annotation? How do you transfer your annotation from Tool D to Platform E when starting a new project, and how might you publish your annotation for others to read and use? Perhaps paradoxically, the growth of digital annotation technology has created product-focused technical silos. As a result, it’s difficult for people to take full advantage of annotation as a means of strengthening their information infrastructure.
In the future, some people may pause before adding notes to a text if, in using a technology, they remain unsure about intellectual property protections, data use policies, or whether their annotation contributes to a broader information infrastructure. Might a day come when groups of people (initially professionals and members of knowledge communities, and then perhaps the broader public) refuse to author annotation using certain tools? Fortunately, both scholars and organizations have published resources that carefully evaluate annotation technologies and can usefully assist people in selecting appropriate tools.4 And, perhaps most significantly, the W3C’s creation of an open web annotation standard in 2017 heralded a new and formative era in which companies, publishers, and developers can choose to design standards-aligned tools that are interoperable, perhaps also open-source, and that benefit you, the annotator, whether your information infrastructure comprises annotated web pages, PDFs, or EPUBs.
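To make the promise of interoperability concrete: under the W3C Web Annotation Data Model, a note is expressed as a small, portable JSON-LD document that names its body (the comment) and its target (the annotated text), so any standards-aligned tool can read, write, and exchange it. Here is a minimal sketch in Python; the target URL and the quoted passage are illustrative placeholders, not real resources.

```python
import json

# A minimal annotation serialized per the W3C Web Annotation Data Model.
# The @context and the TextualBody/TextQuoteSelector types come from the
# standard; the source URL and quoted text are invented for this example.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {
        # The note itself: what the annotator wrote.
        "type": "TextualBody",
        "value": "Vague but exciting...",
        "format": "text/plain",
    },
    "target": {
        # The annotated document, plus a selector anchoring the note
        # to a specific quoted span within it.
        "source": "http://example.org/information-management-proposal",
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "a global hypertext system",
        },
    },
}

serialized = json.dumps(annotation, indent=2)
print(serialized)
```

Because the format is plain JSON rather than a vendor’s internal database record, an annotation like this one can, in principle, move with its author from one standards-aligned tool to another.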
In the nineteenth century, standardized railroad gauges enabled Americans to build the transcontinental railway and connect people, places, products, and ideas. An annotated future envisions analogous infrastructure for digital spaces, across personal and professional contexts, contributing to a shared and robust information infrastructure. Standardized, interoperable, and open web annotation technology, like Hypothesis, now makes it possible to interconnect readers, learners, professionals, content, and software. While the Hypothesis technology may, one day, become obsolete, the organization’s effort to establish a web annotation standard will persist and, we hope, influence others to help build a more open, accessible, seamless, and stronger information infrastructure for readers and their annotation.
Might annotation become an information infrastructure? We believe so. The question of “how” is now no longer solely focused on technological innovation but, rather, concerns shifting social norms, changing financial and organizational incentives, and evolving scholarly practices.
How should we read human-machine annotation?
Annotation has, for centuries, been mediated by technological innovation. From quill and ink, to the printing press and book formatting, to digital applications and platforms, annotation is, and always has been, tightly coupled to the technologies of the day. While people have used various tools and machines to annotate, the future of annotation will feature notes added to texts as people and machines work together in hybrid configurations. Our annotated future will also be authored by human-machine collaboration. How might we read, or make sense of, such annotation?
Intelligent technologies and methods are demonstrating that human-machine annotation can be automated, efficient, and accurate. Scientific research is a promising frontier for such human-machine annotation. Scientists from Google, MIT, and Cambridge University have created and trained a “deep learning” computer program capable of automatically annotating millions of protein sequences 200 times faster than similar models, and with a lower error rate.5 Such deep learning models, according to the scientists, “can rapidly and efficiently annotate novel protein sequences” with implications for the future of molecular biology research, biotechnology development, and, more generally, scientific progress.
Consider, as well, advances in the publication and use of biomedical research. There is a need for scientists to use stable and unique identifiers for the many antibodies, model organisms, and tools (like software) named and published in their literature.6 As scientists generate and incorporate Research Resource Identifiers (RRIDs) into publications, SciBot, an automated curation assistant, has crawled the scholarly web and automatically added thousands of open Hypothesis annotations to hundreds of research articles so as to identify, curate, and note any discrepancies associated with published RRIDs.7 SciBot has been called a “machine/human partnership,” as human “validators” can assess SciBot’s accuracy and add additional information via annotation. With human-machine annotation helping to improve the quality of biomedical research and the value of publication metadata,8 it is not a stretch to anticipate similarly productive annotation-enabled advances in other fields and among other knowledge communities.
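As a rough illustration of the automated first step in a SciBot-like workflow, the sketch below scans a passage of article text for RRID-shaped strings. The regular expression and the sample sentence are our own simplified inventions, not SciBot’s actual matcher; in the real system, human validators then assess each flagged identifier and add corrections or context via annotation.

```python
import re

# Toy pattern for Research Resource Identifiers, which follow the shape
# "RRID:" plus a registry prefix and an identifier (e.g., RRID:AB_300798).
# This is a simplified illustration, not SciBot's production matcher.
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+_[A-Za-z0-9]+)")

def find_rrids(text):
    """Return the RRID identifiers mentioned in a passage of text."""
    return RRID_PATTERN.findall(text)

# An invented methods-section sentence for demonstration purposes.
sample = ("Cells were stained with anti-GFP antibody (RRID:AB_300798) "
          "and imaged using Fiji (RRID:SCR_002285).")

print(find_rrids(sample))  # prints ['AB_300798', 'SCR_002285']
```

In a fuller pipeline, each extracted identifier would be checked against a resource registry, and a Hypothesis annotation reporting the result would be anchored to the article, which is where the human “validators” enter the partnership.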
And yet, as we discussed in Chapter 5, human-machine annotation may not always be a generous, public-facing “partnership” given the complexities of power, labor, and profit. While human-machine annotation will be an important facet of our annotated future, so too may it be troublesome. For hours each day, annotators repetitively draw bounding boxes around digital images to identify vehicles and train machine learning systems for subsequent and automated annotation. So, too, do labelers annotate social media data so as to infer relationships about personal preferences, to classify behavior, and to improve a company’s systems and services. In 2019, Reuters reported that over 250 contract workers in India had, surreptitiously and for years, annotated Facebook users’ status updates, photos, and content, without users’ explicit permission and likely in violation of the European Union’s General Data Protection Regulation, to improve the company’s data labeling efforts and the platform’s AI systems.9
Are data annotators working in “partnership” with algorithms and companies? Is it useful to describe these relationships as extractive manual labor or, alternatively, an innovation of the knowledge economy? And what of transparency and accountability associated with labeling practices?
Human-machine annotation will help propel the next generation of scientific discovery, safely guide autonomous vehicles about your city, and recommend another miniseries to binge. The authoring of human-machine annotation will proliferate, and new annotation arrangements will be automated and lauded for greater speed, accuracy, and insight. It will become increasingly common, in many professional contexts, to arrange human-machine annotation whereby people and technology make intelligent decisions in coordination with one another.
At the same time, it will also be important to assess how different configurations of human-machine annotation reflect particular social, economic, and ethical commitments. The relationship between annotation and data ethics is becoming increasingly acute. The role of human-machine annotation in making sense of, making use of, and making profit from (often proprietary) data should be read as ethically ambiguous: perhaps promising, perhaps also problematic. Amidst criticism that AI can be biased because of human prejudice and ignorance literally coded into technical systems,10 it is important to also recognize how annotation (potentially including labelers’ biased annotation) contributes to the development and training of such technology. If our annotated future is to also be ethical, then various constituents must critically read and actively question the value and differential benefits of human-machine annotation.
How might you annotate for the public good?
We began our book by mentioning Special Counsel Robert Mueller’s report about Russian interference in the 2016 presidential election. We also discussed a flurry of journalism featuring annotation, including for the purposes of fact-checking. The Mueller report and its subsequent controversy, like the trend of journalists annotating the news, are but a few indicators that underscore America’s seemingly intractable “post-truth” era.
In his book Post-Truth, Lee McIntyre suggests a number of strategies for fighting back against “media fragmentation, information bias, the decline of objectivity, and the threat not just to knowing the truth but to the idea of truth itself.”11 McIntyre’s strategies include creating counternarratives that are supported by empirical evidence, repeating true facts in multiple venues and in various ways, learning to be skeptical and questioning confirmation bias, and influencing how people react to changing reality. Annotation, we believe, can assist with and help to strengthen all of these strategies.
We glimpse these strategies in the work of media literacy educators like Michael Caulfield whose students incorporate open annotation into their efforts identifying and analyzing fake news.12 We are heartened by the volunteer efforts of scientists with Climate Feedback whose annotated peer review of climate change journalism corrects the record and encourages media accountability. And we’re inspired by the Credibility Coalition, a group that is working to create “better standards, rigorous research, and thoughtful application around information quality.”13 Annotation is a part of their toolkit, too. The historian Anthony Grafton once observed, “the text persuades, the notes prove.”14 Today, we tell people to “show me the receipts.” We’re not suggesting annotation is the answer to broken political dialogue or rampant misinformation in our post-truth moment. We are, however, hopeful that annotation will continue to usefully complement strategies that encourage truthful journalism, a healthier information ecology, and more informed public discourse.
We are also hopeful that the future of annotation will continue to broaden participation in civic, scholarly, and educational practices that promote transparency and inclusion. A project like Madison demonstrates how annotation contributes to crowdsourced governance through the collaborative creation of new public policy. Advances in open peer review suggest annotation can mediate productive conversation and help to improve the production of new knowledge. Science in the Classroom illustrates how annotation complements free educational resources, helps experts share their knowledge with novices, and can further develop students’ familiarity with disciplinary methods. And the Right to Learn Undergraduate Research Collective puts annotation to work in the service of social change for a more dignified educational future.
All of these initiatives leverage open annotation tools and processes. It’s not annotation technology that’s preventing governments from inviting constituents to annotate draft legislation or prohibiting scholarly organizations from opening up the ritual of peer review. It is, rather, organizational priorities and political will that inhibit the use of annotation for the public good.
We are hopeful about the future of annotation. Annotation is and will continue to be an everyday activity. And annotation is and will increasingly be encouraged by technologies that are easier to use, are more openly accessible, and that mediate meaningful participation in activities of social consequence. How might you annotate for the public good?
Our questions about the future of annotation are not complete without your shared inquiry. And our musings about an annotated future remain incomplete without your commentary. We invite you to share openly through social media your questions, thoughts, and annotation using #AnnoConvo. While this may be the end of a book, we hope it’s the beginning of a conversation about and through annotation in other digital spaces mediated by marginalia. Because marking up a book is an act of love, please annotate the various illustrations included in this book (and elsewhere, too!), and then place your ideas about Annotation into dialogue with other readers, annotators, texts, and contexts. Add your note to this text.
I agree that annotation has the potential to do all this — I remain hopeful. Not always sure that is enough. And annotation margins can quickly become akin to blog comment bins — the source of rage and conflict, as opposed to solutions and compromise.
Audrey Watters has contemplated some of this on her own website over the past several years. Privacy and issues with online bullying will also come into play as annotation gains more traction.
And the question of whose media/words is allowed to be annotated, and when, and by whom, is something to be considered. Can I keep annotation off my words? Should I?
Good question … and one that is often not even clear even if you read the terms of service …
I agree that if we step back and look closely, we can see that the act of layering information on objects and other bits of information is a common everyday practice (is there anything better than sticky notes?)
At PubPub, we’ve seen this play out with various books put up for open peer review: Data Feminism, The Good Drone, and more. The annotations during open review improved the ultimate text.
Do the annotations remain part of the text set somehow? Do they remain part of the public record of a book development? Just wondering out of curiosity ….
I’ve always thought of SciBot as machine-generated/human-curated
If this is going to be a key component of your text — and the on-going dialogue — you need to mention it in Chapter 1 and provide some prompts for discussion throughout the text.
I may simply not be reading this right, but making the comparison of submitting an expense report in relation to the openly annotated future just didn’t ring for me here. Sorry, but perhaps you could find a different example?
I’ve not seen this used as slang before, but used in this context, the phrase and its new meaning were relatively clear to me. (They might not however be as clear to a non-native, non-Western English speaker.)
As I was reading it I was actually thinking of the phrase “Just the facts, ma’am,” which is often misattributed to the TV show Dragnet.
This is an interesting example, but I don’t know that it fully draws out all the ideas that you mentioned above related to “shifting social norms, changing financial and organizational incentives, and evolving scholarly practices.”
Perhaps you could reorganize around — and particularly elaborate upon — these three ideas in relation to SciBot?
As mentioned in the last chapter, an example is worth 1000 words.
Put in an example (or two) and elaborate. Show the reader a time/place where this kind of powerful annotation and interaction led to a change in social norms, financial and organizational incentives, and scholarly practices (or, better yet, all three).
Looking at Twitter as a digital annotation platform as mentioned previously in the text could certainly cover all of the examples Troy requires. Of course sub-examples within it are more likely what he may be asking for…
So, I have held off until now, but I have to ask… and only partially in a cynical manner…
Like the tree falling in the forest, does an annotation really make a sound (ripple, impact, effect, etc)?
That is, I appreciate your utopian vision, yet I wonder if you might want to rein it in a bit here.
Sorry… not trying to pop the bubble, especially after nearly two hours of reviewing and annotating your manuscript, but I am just being realistic.
The first few chapters had a few annotators. Now, here at the end, it is just me. And you two, as the authors. Are we really connected to a “robust information infrastructure?” Or, are the three of us walking alone in the woods?
One of the reasons that annotation has not caught on in the way that I think it should have (and I hope it will) is that annotations were stranded as comments on individual sites. Having access to all my annotations in one place makes me want to create more. I’ll admit that I’m biased, but I think people have been burned by creating content and not being able to move it or then losing it when the tool pivoted or disappeared.
This is an important, if technical, point, and deserves some elaboration. Why is it important that some are built into the browser, whereas others stand alone?
And, for that matter, why have you not mentioned OneNote, Evernote, Google Keep, or SimpleNote anywhere in the text, and especially here before you launch into the important questions you pose below?
This is the first time you mention this phrase which will be used throughout the rest of the chapter.
To that end, could you insert an additional sentence or two where you specifically define what you mean by “information infrastructure,” even though you have given a general description above?
Certain kinds of annotation, yes, will continue to be an everyday activity.
Other kinds, sadly, will not.
I think that you could elaborate and be a bit more specific here.
Again, sorry to be a nag, but I just don’t think that this happens as much as you are suggesting it does.
Even now, where students have access to ebooks (some in the public domain), I don’t see this happening due to material constraints and the lack of explicit teaching practices.
I agree with Troy. Even with the tools being available, that does not mean the practice is happening with any regularity or with depth. I think a lot of schools/teachers are given ebook versions of texts, and provided minimal professional development on how best to use the format with students for annotation, etc.
Yet, was it collaborative? Dialogic? How does Bush’s vision of the memex align with your vision of what annotation is (and could yet become)?
It seems like Bush envisioned a system of layers, yes, but that it would be more supplemental information, not information that riffed off the original and then spanned new directions.