
The Digital Humanities has existed as an institutionalized field of research at Stanford for more than a decade now, drawing undergraduates, graduate students, and researchers from around the globe. This series, a collaboration between Arcade and Stanford’s Center for Spatial and Textual Analysis, spotlights leading research in the digital humanities at Stanford and asks key contributors to reflect on the expansion of the field, its culture, and the major misconceptions that remain.
Why has this field sparked so much public engagement in its projects and debates? How has the digital humanities changed what it means to be a more “traditional” humanist? And how is the field engaging new developments in technology like artificial intelligence?
In this interview, former Interventions editor Charlotte Lindemann speaks with Elaine Treharne, Roberta Bowman Denning Professor of Humanities and former Faculty Director of CESTA.
CHARLOTTE LINDEMANN: Let’s start with how you entered this field. What initially piqued your interest in the digital humanities, at Stanford or anywhere else?
ELAINE TREHARNE: I’ve been at Stanford for 12 years. My job talk, actually, was on technologies of text, “text technologies,” which I suppose is a field now, dealing with book history, digital humanities, and archives. I think the cross-temporality of my project struck a chord with my colleagues in the Department of English. And almost as soon as I got to Stanford, I was able to set up the Stanford Text Technologies project, which I anchored in the Center for Spatial and Textual Analysis (CESTA). So, I think even within a year, I was up there in the Wallenberg building on the fourth floor as part of that community.
Digital humanities is very much a part of what I’ve been doing for the last 20 years, probably longer actually. I’d argue that medievalists are particularly quick to pick up new technologies, partly because we are always alert to how easy it is for others to see the medieval as not relevant or not part of the broader literary discipline. Even from the late 80s, medievalists have been very mindful of the ways that technology can help to build and promote the field of research and teaching.
Twenty years ago, I set up a big project funded by the Arts and Humanities Research Council in the UK, and it was one of the first all-women long-term digital projects that was funded at the national level. It’s called “The Production and Use of English Manuscripts 1060 to 1220.” We finished that project in 2010 and, 14 years later, it’s still fully functional. In fact, we’ve migrated it to Stanford now from the University of Leicester, where it was originally hosted. It’s in the Stanford Digital Repository, and it’s a fully searchable database of medieval manuscripts written in, or including, English that were not included in the main catalogues for this period. We created a database of complex cataloguing information for these materials, which is available Open Access for all.
That was my entree into the world of metadata and categorization and relational databases. Medievalists were using data transfer methods and hypertext years before everybody else, as I’ve suggested. In fact, one of the first images to ever be transmitted digitally was a folio of the Beowulf manuscript sent by the Old English scholar Kevin Kiernan, I think, in 1989, using his phone at the airport, and it cost him almost $100 to transfer that one image. I’ve been alert to these different kinds of ways of looking at texts and thinking about manuscripts and displaying information for quite a long time and am always looking at the merits of new media technologies and the potential of screen-based technologies to investigate medieval manuscripts.
There were originally three of us on the English manuscript catalogue project: Orietta Da Rold, who’s now at the University of Cambridge; Mary Swan, who was at the University of Leeds; and myself, at Leicester. We were forced to engage with metadata through tagging for the database and coding for the manuscripts themselves. In this process, questions that I had always thought I knew the answer to—well, suddenly it became clear that I did not know the answer when I had to classify something as metadata. For instance, in a manuscript that has marginalia all around the central writing, we would ask: is this main text or is it marginal text? And if we say it’s main text, in what ways are we privileging something in the middle of a manuscript folio which may not actually be as important as the thing around the outside for our purposes of date or language? Dating is another good example. We can’t date many medieval western manuscripts, especially those that were written up to maybe 1250 or 1300, because scribes left few traces of their identity. And so, uncertainty reigns in my field. And that makes things like classification and quantification quite difficult; what happens when metadata insists on specific categories of date and no date can be offered beyond the broadest limits? And those questions have stayed with me and will stay with me throughout my career. Everything I’ve done subsequently has been influenced by sitting at a kitchen table in my house, eating courgette soup, debating with my colleagues about how to label components when we need to create data from information. That was my foray into the digital. I became really interested in cataloging, classification, the creation of databases, and now also automated methods for the retrieval of information features of manuscripts at scale.
These are tools, I imagine, that come out of CS spaces or engineering spaces, or what have you. And so, to what extent do you see your research as interdisciplinary?
I’ve never been a fan of disciplinary boundaries because medieval studies is not a single discipline or field in the first place. In fact, my latest monograph is called Disrupting Categories 1050 to 1250: Rethinking Humanities through Premodern Texts and it asks how useful disciplines and periods are to work in the Humanities. To be a medievalist you have to be something of a Latinist, you have to be able to work with early languages. If you work with manuscripts, you have to be a paleographer and codicologist. You have to understand the history of the book. You have to be an editor and textual critic, and that includes digital editing. So, I already think of myself as interdisciplinary: a literary, linguistic, and historical paleographer or something like that. And maybe again, that’s why medievalists are so open to different methods from different areas of intellectual effort. I am really happy to use any and all tools at my disposal. And I am very keen to liaise with colleagues across campus; for example, chemists for the chemical composition of pigments; engineers for automated tools; product design labs for thinking about ways that books are about design and technology. I’m interested in the work of my colleagues in physics, for quantum mechanics, and in mathematics, for things like chaos theory, things that help us to work with the unquantifiable or the vast tranches of uncertainty in certain areas of what we study. I work with students who are CS students, but with a humanistic inclination. And I think going forward, the only way for us all to flourish is to talk to each other more. It’s really that straightforward. Trying to convince some colleagues of that is actually much more difficult than it should be.
Where does this resistance come from?
First of all, resistance emerges through lack of time. It isn’t necessarily a kind of intentional resistance. But resistance also comes from the devaluing of humanistic skills, which we are particularly good at, at Stanford. If I hear one more time from students that, well, you know, “I’m really rubbish at math, but I can easily read a book” or something... Even some of my colleagues, in fact, will place a higher value on something they don’t understand like mathematics or artificial intelligence than on the ability to explore in detail a paragraph of dense eighteenth-century prose. That same assumption of difficulty doesn’t often get applied to the humanities. I remember very distinctly one of our senior managers and one of our university leaders, in fact, saying that they really understood the importance of humanities because they could play the cello. My point is, you might be very good at the cello as a hobby, but you’re not a professional cellist; there’s a difference. I’m a professional humanist and you cannot understand or easily replicate what I do unless you spend twenty years studying Old English and its manuscript contexts. In other words, we all have our skill sets that should be mutually respected and reinforced, campus wide, but also actually valued. And I do mean monetarily as well as intellectually.
The word that’s lacking in this discussion is “professional.” Professional humanists have years of training and high-level experience, just the same as an AI engineer, so our skills could benefit from the same respect. We have lots of research projects on campus that seek to be interdisciplinary, but they don’t tend to include professionally trained humanists. For instance, say the topic is medical humanities. We might see colleagues in other disciplines bringing in doctors who write poetry, but not including professional poets or literary scholars. A lot of the human-centered artificial intelligence and arts initiatives on campus do not actually involve professional humanists. I’ve read books on neurosurgery, but I wouldn’t think to start claiming to be a neurosurgeon. And while the results would not be so drastic if you think you’re a humanist because you can play the flute, you’re not, and I see a significant undervaluing of professional, humanistic skills even from the humanists themselves.
How do you change students’ minds and teach them about the value of the humanities?
In different ways, often one-on-one. It’s great that CESTA can bring in CS interns and offer these students a different environment for them to work in and different kinds of projects for them to contribute to. I was teaching in Oxford in the winter quarter, which was just a wonderful opportunity. I had 14 students in my class on cultural heritage, and only a few of them were humanists. Probably half of them were CS majors. In the work that they did for me, especially the writing, I could see that some of them were asking questions like a humanist. So, I raised this with them and encouraged them. I’ve continued to work with several of those students subsequently. One of them is an advanced AI student who wanted to consider how we can nuance Large Language Models which have bias incorporated within their outputs, such that in that nuanced transformation there is an intentional mitigation of harm from the racist, anti-feminist or sexist stereotypes present in these models. The student developed an additional layer in the model at a final level of the transformer process. Her test cases suggest that it works to an impressive extent. Of course there’s still work to be done; this was just one independent study.
That sounds like a tremendous accomplishment.
The student did a lot: they read about sociolinguistics, decolonizing the curriculum, data feminism. They worked on designing models. They looked at Dan Jurafsky’s work and Londa Schiebinger’s research. And they worked in other languages, specifically South Asian languages, carefully considering translation as a mode through which persistent replication of biases can occur, which is right, of course.
Translation itself or translation through large language models?
Translation through Large Language Models. For example, the student compared how the model responds when you prompt it and ask questions about illnesses and treatment in English versus Hindi. English might give you a much more helpful and specific set of things to do in the event of certain symptoms, whereas the Hindi translation talked about remedies using stereotypes of South Asian cultures and was much less likely to give you efficient, contemporary medical advice. That was shocking. We have a research student at CESTA who is working on Ottoman Turkish and trying to mitigate different forms of bias in the translation model that they are building. So, in answer to your question, I think you communicate these investigations and successes to students on a one-by-one basis through mentorship, or through small classes, or the tailored Digital Humanities Minor. If you teach a big class, you might be able to get 10% of that class back into humanities, even if they come from CS, or another STEM area. It has been a hard sell but is less of a hard sell now because some students, as we see them coming in, are actually more predisposed to interdisciplinarity than they were previously. I believe Creative Writing, for instance, is the largest minor on campus. I think we’re in a fairly good position to capitalize on a shift in students’ academic desires that we’re seeing post-pandemic.
I’m excited to hear it. Are there new trends or developments in the field that you are particularly hopeful about?
AI is one of the big ones, isn’t it, because of the promise that we can infer it has. I use AI a little, but I’m just messing around with it, really. I get GenAI to write Old English poetry for me so I can see how it responds, what data it has ingested that it understands as early English. I ask it questions about medieval manuscripts that we don’t know the answer to, to see in what ways it guesses. And it’s often completely erroneous: hallucinating, but so convincingly. That’s a concern and a threat to knowledge. AI isn’t self-critical, so the user needs to be deeply critical themselves. If you’re not an expert in the field of knowledge you’re asking about, you are not going to be able to tell whether what you get back as a response is right or not.
I’m sure for you, the Old English poetry it produces is laughable, whereas for somebody who’s not a specialist, they might potentially mistake it for the real thing.
Well, it’s super clever. It creates poetry out of pre-existing half lines. So, it’s probably undetectable if you don’t know that these things are not real verse-lines, or these two half lines can’t be put together because of the stress pattern, or whatever. In other words, if you’re not already an expert, you just will not know if it’s right. What are you going to do with that? How many books are we going to see with material that is actually just wrong? That’s obviously a concern, but the potential of the tools is also exciting.
Right. And it sounds like some of the projects even your own undergraduate students are working on are aimed at providing the kind of guardrails that this technology might need.
Yes. And again, that’s what makes humanists so valuable, their critical and analytical abilities and the fact that they don’t necessarily look for clear answers. On the contrary, deep questioning often results in further, open-ended questions. For instance, during the pandemic, I was teaching a poem, Wulf and Eadwacer, and I told the students I don’t know who wrote it or why, or for whom or when or how it was written, and I also don’t know what it means with any certainty. A student wrote back in the chat function on Zoom—and I think he thought he was sending a message just to his friends, but it was sent to everyone—that said, “She doesn’t know anything about it!” And, you know, I told them that’s very cheeky; I can see that comment. And it’s not that I personally don’t know because I’m no good at what I’m doing; I don’t know these key elements because these aspects are not known. The implication is that if you can ask the question, you should be able to look it up on the internet and find the answer. The internet will certainly provide an answer, though whether or not it’s accurate will depend. And that is only going to be augmented by AI, which will fill in gaps with complete fabrication.
AI is a threat for all of us because it can generate misinformation, but it’s particularly dangerous for anyone whose field is not predominantly English, or one of the other big global written languages. One of the big projects I’m in at the moment is called SILICON (silicon.stanford.edu) and it’s concerned with enhancing the digital presence of digitally disadvantaged languages. We’ve got students now working on coding and on designing keyboards for digitally underrepresented languages in South Asia, the Americas, and Africa. We’re trying to make sure that these languages have a presence through Unicode so there’s a digital world in which they can exist. Because it’s bad enough that they’re endangered languages, but when they don’t have any digital presence, they might be prevented from competing—culturally, socially, intellectually, socioeconomically.
You’ve mentioned your students. I wonder if you can speak a bit more about the role of students at CESTA, or how teaching and research relate for you.
CESTA is probably the world’s leading transgenerational humanistic-scientific center. The world’s leading. And I say that because of the program that we have for our graduate interns, but also more specifically, our undergraduate interns. This deliberate transgenerational approach to the dissemination and the gathering of information and expertise is truly extraordinary.
Transgenerational?
Meaning across university hierarchies like undergraduate, graduate, faculty, staff. I ran a project—this was Cybertext, actually—that, over time, involved myself, a project manager, a postdoc, three graduate students, and a whole team of undergrads. Transgenerational projects create a community where undergraduates can feel comfortable asking questions of the people around them. From my perspective, that is really what makes CESTA as special as it is. Everybody is welcome, everyone has a contribution to make.
Through this series, it’s been interesting to hear how little support CESTA receives from the university. I wonder if you could weigh in on this.
In some ways I think being at Stanford makes our work as humanists harder, and this has to do with Silicon Valley’s obsession with innovation and entrepreneurialism. That’s bollocks. For a start, there’s no such thing as out-of-the-blue innovation. Everything is built on the shoulders of giants, including the mistakes of those who come before us. But this drive for innovation, this drive for something new hinders the work of CESTA because, by this time, CESTA isn’t new. CESTA was, in fact, one of the first digital humanities centers in the world. And, in the world of new and shiny things being preferred, if something is not new and it’s rolling along and it’s doing very well on $2.50 and a Coca-Cola, then why would you give it $1 million? But this is precisely the effort that should be funded and supported: it is a proven success that with decent resources would be unstoppable in what it offers scholars and students.
I was talking to my beloved colleague Alice Staveley yesterday, and we have lots of ideas for work going forward into next year, having to do with letter presses and print culture. And we’re both already at CESTA but I said to her, we’re going to have to give it some nice new acronym so we can go out and get funding. You have to get as much funding as you can. Because after that, you’re back to the NEH or another source of relatively small funding. Did you know Stanford takes more than half of funding from federal agencies like the NEH in overheads? That leaves the researcher with really very little money to operate on.
What about the endowment?
When you ask them, could we just have some money for x or y, officers will tell you “no, because the endowment is earmarked for this, this, this and this.” And I get that, but then we get locked in this newness paradigm, the progress narrative, the teleological impulse of Stanford in Silicon Valley. These things are mutually constitutive. That’s the problem that we really face in our time and location and that detrimentally impacts really fantastic projects.
So digital humanities is in this weird space where it’s supposed to be the new thing, but it’s not new anymore. You end up constantly on this hamster wheel of repackaging it as new.
I gave a speech to the Board of Trustees, frighteningly, a couple of years ago. The then-president came up to me afterwards (he’s a neuroscientist), and he said, it’s very interesting to hear about digital humanities, but I can’t imagine there is a humanist who doesn’t already do digital work. I had a car full of students a little while back; I took them down to LA in November. And I was talking about digital humanities, and they were laughing, like, that’s ridiculous! We’re all digital. They’re born digital. So, I do think—all around—that the label “Digital Humanities” is a hindrance to others’ understanding of the research field and its significance. And no, not all humanists do digital work, not by any stretch of the imagination.
Has the so-called digital humanities changed the traditional humanities? I mean, maybe this is a false binary at this point and digital humanities is no longer a stable, separate category from the humanities.
Digital humanities is so diluted now as a set of practices, as a concept, as an understood phenomenon. I don’t know how useful it is to render effective the work of actual digital humanists. I think that digital humanities does itself a sort of disservice by the persistence of that label.
What is really brilliant about literary studies, arguably more than any other non-STEM field, is its constant redefinition of itself. You think, oh, we’re structuralist. Well, now we’re post-structuralist. No, now we’re something else, and on and on. We’re this, we’re that. We can move, we can transform. I think now is the time for digital humanities to transform because there is still so much excitement about it and yet funding for it in its present form is actually on the wane.
You’ve already touched on this a little bit, but I’m curious to hear what misconceptions about the field still remain, from your perspective.
One of the main misconceptions is quite an old misconception, but it still bubbles up occasionally. It’s about digital humanities taking away from or implicitly derogating traditional methods. And this brings me back to the idea of valuing professional humanities skills. You could be the most fantastic coder, but you still need the expertise to interpret the data. You need the domain expertise, the domain knowledge, to be able to apply your findings. Digital humanities involves the traditional researcher in another set of skills, but it’s another set of skills that still absolutely requires the domain expertise for interpretation. Expertise cannot be automated. Tell that to generative AI and all its fans.