The Digital Humanities has existed as an institutionalized field of research at Stanford for more than a decade now, drawing undergraduates, graduate students, and researchers from around the globe. This series, a collaboration between Arcade and Stanford’s Center for Spatial and Textual Analysis, spotlights leading research in the digital humanities at Stanford, and asks key contributors to reflect on the expansion of the field, its culture, and the major misconceptions that remain.
Why has this field sparked so much public engagement in its projects and debates? How has the digital humanities changed what it means to be a more “traditional” humanist? And how is the field engaging new developments in technology like artificial intelligence?
In this interview, Interventions editor Charlotte Lindemann speaks with Nicole Coleman, Digital Research Architect for the Stanford University Libraries.
CHARLOTTE LINDEMANN: What initially drew you to the digital humanities?
NICOLE COLEMAN: The practice of digital humanities for me came with Mapping the Republic of Letters, an international and interdisciplinary research project that led to the formation of the research lab Humanities + Design, one of the initial labs at the Center for Spatial and Textual Analysis (CESTA). The creation of the Center itself was pivotal for digital humanities both at Stanford and elsewhere. And yet, the Center’s name very intentionally does not include the term “digital humanities.” At the time, we wanted the name of the center to capture the nature of the work we were doing. It was collaborative. Zephyr Frank, the founding director, came up with the beautiful acronym CESTA, which means “basket” in Portuguese. We thought of what we were doing as not just about being digital—I remember I posted an article at the time that used another term, “laboratory humanities.” It reflected the idea that the work is collaborative and takes humanists outside of themselves in a way that’s different from reading and writing.
Most of the work that I’ve done has been in design and information technology; thinking about how we build technologies and systems to support scholarly work. The three initial CESTA labs, Spatial History, Literary Lab, and Humanities + Design, were all applying digital methods, but they approached the work in very different ways. For example, the Spatial History Project was looking at the affordances of Geographic Information Systems (GIS) mapping, a statistical mapping tool. It was exciting to be able to think about mapping and visualization as part of the process of writing history. And yet at the same time, I remember how deeply frustrated Richard White was about the fact that, in history, the line between two points is never straight but in mapping tools like GIS, that was always the case, not to mention the fact that temporality wasn’t part of those tools at all. They had to find ways to work around that. And their research ultimately did influence the design of the tools they were working with. At the Literary Lab, the former director, Franco Moretti, was very adamant about the need to embrace quantification for literary studies. Mapping the Republic of Letters, the research project that launched Humanities + Design, was exploring early modern intellectual history and the exchange of ideas over time. But we realized there really weren’t existing tools that fit our needs. So, we started involving designers, bringing researchers in the field of design into the project to help us think, what form should this take? And, how might we design tools that suit our questions rather than trying to develop questions that fit the tools that exist? Visual analysis played a significant role in thinking about constructing data, shaping data, thinking about how we transform rich information sources into data. Visualization gave us a whole other kind of language with which to think about data.
I didn’t know that story about the founders of CESTA intentionally avoiding the term “digital humanities” in naming the center. I wonder if looking back on that decision, do you still have strong feelings about the term?
It’s not that I have a strong position on this, but I’m old enough that I remember, as a freshman in college, we had a special room where there was a “word processor.” It was a computer. I don’t even remember what kind of machine it was, Compaq or something, but it was a place where you went to write your papers. “Word processing” is now just “writing.” What’s more important to me than defining the term “digital humanities” is really thinking through our relationship to new technologies and making sure we have a critical position. I’d rather focus on what it is that we’re accomplishing. What are we doing? And how is it changing the field?
You’ve mentioned your experience working on the design of digital tools. I’m curious to hear some of the specifics of that kind of interdisciplinary work and how you keep the humanistic methods front and center.
I’ve been involved in several collaborations between computer scientists and humanists, and I’ve never known it to be easy. There are a number of reasons for that. People’s approaches to research, and their research goals, are just so, so different across disciplines. I’m not talking about the kind of project that might hire a designer or a developer to build something for you; I’m talking about interdisciplinary projects that are genuinely collaborative, that have, if not necessarily shared research goals, then at least complementary research goals where we can find a point of intersection and learn from each other and see that there’s a mutual benefit. That’s where things become interesting, and where we can start to forge new paths.
In the early days of Mapping the Republic of Letters, we worked with computer scientists who were very willing and interested in being involved in humanities projects. But things would come to a standstill when they needed us to formulate the work to be done as a problem to be solved. Their basic approach was hard to reckon with from the humanities perspective because humanists do not tend to think in terms of problems and solutions. The work rests in formulating questions. To some degree, I would say, that opposition still persists. It’s not that people are so much entrenched; it’s just that actually having the conversation and coming together is really quite difficult.
To give you an example, I was involved in this fantastic, funded research project. It was funded by the Hoffman-Yee Grant through the Human-centered AI Institute. A number of faculty and graduate students in the humanities, social sciences, and computer science came together. The project was called the Concept Time Machine, and the idea was to try to see if machine learning could really contribute to our understanding of how concepts change over time. The fundamental problem was a very different disciplinary understanding of the nature of a concept. From the computer science perspective, we start with a term. Does the definition of this term change over time? That approach does not contextualize the question historically or from a literary perspective, understanding the significance of meaning and how we shape our language. None of that was part of the conversation.
That’s interesting. And I guess the presumption there is that words at any given time have a stable definition, which we would tend to disagree with as humanists.
Exactly. While I’ve seen these obstacles to interdisciplinary understanding persist, there is tremendously exciting new work being done by Stanford graduate students and postdocs in the humanities. For example, Nichole Nomura, the Mellon Sawyer Postdoctoral Fellow at CESTA, recently gave a presentation where she made something measurable in humanistic terms that computer scientists developing word embeddings do not consider measurable. Word embeddings are essentially about measuring the distance between words in multi-dimensional vector space. I have heard computer scientists describe the results as “like magic.” For me, coming out of the library, it’s not magic at all, it’s correlation. But it makes sense that it would seem magical, that large language models, for example, can reproduce things in a style that predicts what you want because they have been fed so much material. But it’s not anything like thinking or “intelligence.” That’s one problem. The other is that this kind of analysis can really miss the importance and the value of literature, of what we can understand about ourselves through literature or through history. With Nichole’s work on analogies, she is making her interpretive work as a humanist measurable. And to me that represents an important bridge to understanding and potential collaboration with computer science. This is going to be transformative because the technology is extremely powerful, so if you can take that technology and really train it in a way that teaches us something more about ourselves, that’s where it’s really going to matter.
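[The idea that word embeddings “measure the distance between words in multi-dimensional vector space” can be sketched with a toy example. The three-dimensional vectors below are invented purely for illustration; real embedding models learn hundreds of dimensions from large text corpora.]

```python
# Toy sketch of distance in a word-embedding space.
# The vectors are hypothetical stand-ins, not output from a real model.
import math

embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Words used in similar contexts end up pointing in similar directions,
# so "king" sits far closer to "queen" than to "apple".
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["apple"]))
```

[The “magic” Coleman mentions is exactly this kind of correlation: geometric proximity standing in for similarity of usage.]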
So, if I’m hearing you, humanists should not be adopting new technologies and then measuring their own research outputs by those external standards coming from computer science? Instead, we should be taking the technology and creating our own ways of using it and our own standards for measuring the results. So, we’re still focused on research outputs and not just critique and questions, but we define those outputs on our own terms?
Well, I would agree with that too, but that’s not exactly what I mean. I think, oftentimes humanities scholarship is not about solving problems, as I said before. What really counts in this very engineering-dominated world that we’re living in now, this very technologically determined world, is measurement. So, if we can speak the language of measurement confidently, that is really powerful. And it gives us a way of translating what humanists do in terms that are more readily understood within engineering. It can help to counter the thinking that only engineering can address the urgent challenges we face in society and on the planet as a whole. In fact, there are a lot of problems that engineering has not been able to solve. And one could argue that engineering has also created a lot of problems. I’m a technologist, I’m a huge fan of technology in general, but we can’t just lean in one direction and lose sight of the importance of the humanities as well.
What I was trying to get at with the example of Nichole’s research is that she does not generate word embeddings as merely a step in a process to enable some further analysis. Instead, she is engaging with word embeddings theoretically while asking, “How can this technology assist me in looking at something in a new way?” Thinking about that process of mathematically creating representations of texts allows her to get at the expressive qualities of that process, similar to the way an artist chooses a medium and uses it as an extension of thought. As a result, she can show us how to measure things, like analogy, that computer scientists do not consider measurable.
Let me offer another kind of analogy for this from computer vision research. The great advances that researchers like Geoffrey Hinton made against the ImageNet benchmark were described in terms of measuring the capabilities of computer vision against humans. Our model beat this benchmark and it is so much faster than humans at deciding whether this is a cat or a dog in an image. I think it is probably safe to say that for humanists, in general, that is absurd. It is like saying, I have this pair of binoculars that can see so much further than I can see. Well, no, the binoculars do not see, but I can use them as a tool to see farther than I can see without them.
The kind of work that goes on at CESTA, where graduate students work with and think with technology, is unique. I see this, too, with the undergraduate internship program at CESTA. It was very important to Richard White, even before CESTA existed, when there was just the Spatial History project, to demonstrate that undergraduates could do research. Many of the undergraduate interns have come to Stanford to study computer science. CESTA provides an opportunity for them as undergraduates to learn how to do research in the humanities. I have spoken with a number of them who talk about how powerful an experience it is to take the skills that they’re learning in computer science and apply them in a humanities context where they can take a project from beginning to end and understand how the technology relates to questions we are asking about the world.
I have consistently heard from our undergraduate interns at CESTA that in their CS classes they work primarily on finding efficiencies, on optimization problems. It makes sense. Statistical machine learning is so expensive that, of course, the more efficient we can become at it, the better. So that ends up being the primary goal for an undergraduate. But at CESTA they have a chance to put computational methods into action, solving real-world problems, guided by faculty and being exposed to humanities theory along the way. It gives me hope that if those students go on to work in Silicon Valley, at least they’ll take that with them. CESTA has so much potential for making these practical inroads and yet it’s underappreciated and underfunded.
This dovetails with the undergraduate teaching theme, but where do you see the role of the public humanities in all of this?
Public humanities has been for many years a part of what CESTA does. Again, it’s challenging work to do because of just how competitive funding in the humanities is in the first place. In the past year or two, we’ve seen a lot more emphasis on public humanities. Having funding structures in place to support that work really helps. One source of inspiration for me lately on this topic is a recent co-authored book called Towards a New Enlightenment: The Case for Future-Oriented Humanities. A central theme of the book is reconciling theory and practice. From my position at Stanford in Silicon Valley, this idea is immediately relevant to my own experience with the challenge of reconciling humanistic theory with computational practice. The authors offer a productive way of thinking about how we make that happen at Stanford. The authors are concerned about the global crises, ecological and social, that we’re facing and the question of how the humanities are going to respond.
Part of the fundamental crisis that we are facing has to do with our relationship to technology, coming out of a particular Western tradition that views humans as separate from the technology that they create. There are other non-Western philosophical traditions that we could turn to that approach the question of how we relate to technology differently and that consider our relationship to the world as more integrated and even planetary in scope. We have to start thinking in that way and operating that way if we are going to stem the tide of problems that we’re causing for ourselves and for others. If we could bring these theories into the design of the systems, that would be the key that I think would just make all the difference. I remain optimistic. I don’t think that there’s even much opposition, just a lack of understanding and an ability to speak across the disciplines.
And perhaps like a lack of funding.
Absolutely. If you look at the Human-Centered Artificial Intelligence Institute, the amount of funding that they have received is astronomical. And of course, same goes for the Doerr School of Sustainability. There’s just not anything like that for the humanities and certainly not for CESTA. I believe strongly that we need to support the humanities writ large. One doesn’t need to be computational in their work to be making important contributions, of course, not by any means. But I do think, at the same time, that CESTA is important precisely because it is a place where we can start to come together across disciplines. That’s why I mention Nichole Nomura’s work, and the work of so many others, for instance, Merve Tekgürler, who’s the senior fellow in the graduate fellows program this year. Merve has a fantastic ability to think about history in light of machine learning. It really is an exciting moment where the conversation is starting to happen because these young scholars are adapting the technology to their disciplinary methods and helping to shape it. If we don't have a place where that happens, there is no interface between the humanities and emerging technology.
Since we’re talking about the future, are there current initiatives or projects that you’re particularly excited about?
I come from the libraries, so my interest in what happens at CESTA is also very tied to my interest in bringing statistical machine learning to library collections and to archives. Think of Optical Character Recognition (OCR), something which we just take for granted now, the ability to search all of this material from the 18th, 19th, 20th centuries. It has completely transformed research. That’s huge. And yet there is still so much material that isn’t suited to OCR—handwritten materials, but also materials in different languages. The challenge isn’t just about perfecting the technology, it’s also about developing more sophisticated ways of understanding these materials and the meaning that’s held in them, the difference between the form of handwritten materials and of print, for example. There’s so much that we could be learning right from this material that might be lost or compromised if we applied the technology in an unsophisticated way.
Right now, things built from an engineering perspective, and I’m thinking of machine learning in particular, tend to be very closed systems. That is to say, an end-to-end problem-solution system. For example, a chatbot. You can ask a question and then you get an answer. You can ask it to write you some code and it’ll do that. That’s interesting and very useful, but also very limiting. What is happening now in the technology companies that are focused on large language models is really about information retrieval, getting us information. But that has become equated with “intelligence” somehow, even if we’re only speaking metaphorically. We have access to information on demand and at a massive scale that wasn’t available to us before. There’s no question that is valuable. But compare that kind of system to an information system like a library and start to ask, well, how are they different?
Well, a library, any library, is an open system. It’s an ecosystem. Stanford Libraries is connected to libraries around the world. We exchange materials, we exchange metadata descriptions. There are so many ways in which libraries are interconnected. In addition, we use technologies like classification systems that are similar to the way information is processed computationally. Classification can be problematic, even dehumanizing, because it is reductive. It needs to shift and change over time, and it changes slowly because if it changed too quickly, there would be chaos. But the way that we manage it and make the whole information ecosystem viable and useful to scholars and useful to the public is with human beings. We have subject specialists, we have metadata experts, a reference desk. We have people who are doing all of this work of interpretation and decision making and curation and assessment that makes the information available to us. And it can be very particular and very personal. There is that human element, which is just absolutely essential.
Then you start to ask, could we incorporate this technology, this very, very powerful technology into an environment like this? I remember in the early days, I had started this AI initiative within the library, and part of that was staging conversations, just bringing in experts like Peter Norvig to speak to my colleagues in the library to get them familiar with AI and have them ask questions. There was so much talk at that time that, oh, AI is going to steal your job, and people should be afraid. And of course, the librarians are going to be afraid of losing their jobs. So, you should be careful about bringing this into the library. But, by and large, my library colleagues were full of great ideas about how this technology could be used to assist them. They know that the work they do, whether in curation, metadata or myriad other tasks, requires specialized knowledge and careful decision-making. They knew that they wouldn’t be out of a job.
I’m thinking of your metaphor of the binoculars; it’s like we would never think that binoculars would replace anyone’s job. You still need a person looking through the binoculars.
Yeah, and there’s so much material; as a tool, AI has tremendous potential. If the kinds of investment that are going into things like HAI and the Doerr School were going into the library and the humanities as partners, it would just be fantastic.