Artificial intelligence is beginning to reshape one of the most foundational areas of legal practice: contract law. In this episode, Jen Leonard and Bridget McCormack speak with Professor Dave Hoffman of the University of Pennsylvania Law School about how tools like Claude Max are influencing contract interpretation, legal research, and the future of agreements.
The conversation explores Hoffman’s research on law journals’ AI policies, the emerging role of agent-driven B2B contracts, and the growing challenges facing legal education. He highlights how AI can support more consistent and predictable interpretations of contractual language, while raising deeper questions about how lawyers are trained and how legal value is defined in an AI-integrated world.
Key Takeaways
- AI as a tool for contract interpretation: Large language models can help judges and lawyers assess the “plain meaning” of contract terms by analyzing the full context, offering a potentially more consistent alternative to traditional dictionary-based methods.
- Claude Max in legal research workflows: AI can aggregate and then analyze hundreds of policy documents into structured datasets, dramatically accelerating empirical legal research.
- The uncertain future of transactional law: As AI automates document comparison, drafting, and term analysis, the traditional value proposition of transactional lawyers may shift toward higher-level judgment and strategy.
- Agent-driven contracting and disputes: The rise of AI agents executing contracts introduces new complexity in dispute resolution, raising questions about how existing legal doctrines apply when algorithms—not humans—make decisions.
- Legal education under pressure: Law schools are rethinking curriculum, assessment, and skill development as AI challenges traditional models of legal reasoning, writing, and evaluation.
Final Thoughts
AI will not eliminate contracts—but it will change how they are written, interpreted, and valued. As these tools become more embedded in legal workflows, the challenge for lawyers, judges, and educators is not whether to adopt them, but how to use them in ways that enhance clarity, fairness, and trust in legal systems.
Transcript
Jen Leonard: Hi, everyone, and welcome back to AI and the Future of Law, the podcast where we explore all of the fascinating dimensions of artificial intelligence and what they mean for the law. I'm your co-host, Jen Leonard, founder of Creative Lawyers, here as always with the wonderful Bridget McCormack, president and CEO of the American Arbitration Association. Hi, Bridget. How are you?
Bridget McCormack: Great. Good to see you. It was great to see you at Legal Week this week.
Jen Leonard: I know—it was great to see you. I really loved your panel. I would love to do an episode about all the things and unpack them.
And we're really, really excited today to welcome one of the preeminent thinkers in the country on contract law and on how contracts are changing in the face of emerging technologies—and in particular, artificial intelligence.
So thank you so much for joining us today, Dave Hoffman. Dave is a professor of law at the University of Pennsylvania Law School. His research is unusually interdisciplinary. He uses empirical data and psychology to explore how people actually behave when they make and break agreements. And we're going to talk a little bit about that today—and what happens when it collides with AI.
So thanks for joining us, Dave.
Dave Hoffman: Thanks so much! I'm super excited for the conversation.
AI Aha!
Jen Leonard: For those who've listened to the podcast before, you know that we have an introductory segment that we call our AI Aha! And, Dave, we hear from people all the time that they love to hear how our guests are using AI in their personal lives and their professional lives.
So we would love to kick it off by asking you what you've been using AI for recently that you think is interesting.
Dave Hoffman: So it's a well-timed question. About three weeks ago, the deputy dean at the law school gave a presentation in which he said, “You folks really need to install Claude Max. It'll change your lives.”
Until then, I had not. So I had been using just regular GPT.
So I installed Claude Max, and immediately I got to use it with a really fun project, which was—I’d been working on a Substack about contracts, and I decided to write a new post about how law journals—law reviews that law professors try to publish in—use contracts to control authors’ use of AI.
And it turns out, you know, there are several hundred journals out there, and I had Claude Max, through a co-work setup, crawl every single website it could find, download the AI policies—whole PDFs—summarize those policies for me, and try to compare some of the language.
And I got, as a result of that, a really cool dataset that I then wrote up in the Substack to try to show how law journals were—and were not—responding to changes in how authors use AI.
So it was a really great project. It worked out pretty well. Claude Max only hung up twice, so that was a win from where I sit. And the ultimate product was pretty cool.
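Hoffman doesn't walk through the mechanics on air, but for readers curious what this kind of policy-aggregation workflow can look like, here is a minimal sketch. It assumes the `anthropic` Python SDK, an `ANTHROPIC_API_KEY` in the environment, a hypothetical list of journal policy URLs, and an assumed model name; a real crawl would also need link discovery, PDF text extraction, and rate limiting.

```python
import csv
import requests
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical policy pages; Hoffman's actual journal list isn't public.
JOURNAL_POLICY_URLS = {
    "Hypothetical Law Review": "https://example.edu/hlr/submissions",
    "Example Journal of Law": "https://example.edu/ejl/ai-policy",
}

PROMPT = (
    "Below is the text of a law journal's submissions page. Extract its policy "
    "on authors' use of AI, if any. Answer with three lines:\n"
    "has_ai_policy: yes/no\n"
    "disclosure_required: yes/no/unclear\n"
    "summary: one sentence\n\nPAGE TEXT:\n{page}"
)

def summarize_policy(page_text: str) -> str:
    """Reduce one journal page to a short structured summary."""
    message = client.messages.create(
        model="claude-sonnet-4-5",  # assumed model name; substitute whatever is current
        max_tokens=300,
        messages=[{"role": "user", "content": PROMPT.format(page=page_text[:50_000])}],
    )
    return message.content[0].text

# Collect one row per journal into a dataset, as Hoffman describes.
with open("journal_ai_policies.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["journal", "model_summary"])
    for journal, url in JOURNAL_POLICY_URLS.items():
        page = requests.get(url, timeout=30).text
        writer.writerow([journal, summarize_policy(page)])
```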
Jen Leonard: That's very cool. And I should mention you have a Substack called Contracts Empire.
Judges and Contracts
Bridget McCormack: I actually love your Substack. It's super interesting. As somebody who runs an organization that tries to resolve a lot of contract disputes, I love the sort of expansive thinking about where contracts might go—especially your recent writing about AI.
And so I want to kind of start with your piece on generative interpretation and judges’ roles in interpreting contracts—and how that might change as a result of large language models, so that they could sort of become the workhorses that do some of the work that judges now do in contract interpretation.
Can you tell our listeners a little bit about that idea?
Dave Hoffman: Yeah, absolutely. So a couple of years ago—I think the paper came out in 2024, but we worked on it in 2023 and 2024, so many generations ago—my coauthor, Jonathan Arbel, who's currently at Alabama Law School, and I started to think about how it would be that these models would be useful for judges in the day-to-day work that they do to interpret contracts.
And the reason why we had that as a research question is because we know that a lot of what lawyers do when they litigate contract cases is trying to figure out the meaning of allegedly ambiguous contract terms.
And the technology that judges have to do that work is mostly the dictionary. I mean, it's a pretty good technology in that it's cheap, but it's a pretty bad technology—it doesn’t really help you answer any hard question. And there's always dictionary meanings on either side of a question.
And there's a lot of contract scholarship—unsurprisingly—about how bad judges are at getting stable answers to contractual interpretation questions. Because in any contract dispute, good lawyers on either side can make very plausible arguments about what the parties meant about this thing they probably didn't talk about.
So we thought, look—what are these models other than kind of prediction engines to try to figure out what text means in a majoritarian way? They're giving us a sense of what the predicted next answer is to a sentence. The sentence starts like, “The following word means X,” and it predicts it based on the other words that it sees.
So what we decided to do in this paper that eventually was published was to take a bunch of different real-life contracts that we had gathered from case files and throw them into models—lots of times, iterating with lots of different kinds of prompts.
This was when prompt engineering was a little bit more important because the models were a little bit less stable. And to show that what you got when you asked the model to predict meaning was actually pretty close to what the judges ended up coming to—but in a way that, at least we thought, would be cheaper, more accessible, and more transparent than the sort of dictionary-typing we see sometimes in practice.
We didn’t say that judges should give over their job to the models, but rather that this is something they could use to help de-bias themselves—to make sure that they were not, for example, just going with whoever they want to win—and to second-guess themselves, check themselves, and ask: Am I sure I'm correct about this particular meaning? What am I not thinking about?
And the great thing about this, of course, is that you can have the whole contract in the model and use all of the different text and context. So you're not just looking at a particular word—you’re thinking about all the words the parties used.
So we published the paper. It got a bunch of attention—and of course, criticism. And we continue to work on the project of bringing this research forward now that we're in the next, next, next generation model, to try to understand how we can usefully use this technology to make contract interpretation a more stable, predictable, legitimate enterprise. That’s our goal.
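The published paper doesn't prescribe a single implementation, but the core move Hoffman describes (feed the model the whole contract, ask for the majoritarian reading of the disputed term, and repeat the query to gauge stability) can be sketched in a few lines. The model name, disputed term, and candidate readings below are illustrative assumptions, not the authors' actual protocol.

```python
from collections import Counter
from anthropic import Anthropic

client = Anthropic()

contract_text = open("contract.txt").read()  # the full agreement, not a lone clause
disputed_term = "reasonable best efforts"    # hypothetical disputed language
reading_a = "the promisor must take all steps short of serious financial harm"
reading_b = "the promisor must take commercially sensible steps"

prompt = (
    f"Here is a complete contract:\n\n{contract_text}\n\n"
    f"The parties dispute the meaning of '{disputed_term}'. "
    f"Reading A: {reading_a}. Reading B: {reading_b}. "
    "Which reading would an ordinary English speaker, reading the whole "
    "contract in context, most likely adopt? Answer with 'A' or 'B' only."
)

votes = Counter()
for _ in range(20):  # repeated sampling gauges how stable the answer is
    message = client.messages.create(
        model="claude-sonnet-4-5",  # assumed model name
        max_tokens=5,
        temperature=1.0,            # sample rather than take a single greedy answer
        messages=[{"role": "user", "content": prompt}],
    )
    votes[message.content[0].text.strip()[:1]] += 1

print(votes)  # e.g. Counter({'A': 17, 'B': 3}): a rough majoritarian signal
```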
Bridget McCormack: Super interesting. I have focused a lot on the sort of quantitative need for large language models in courts—especially state courts that just don't have the minutes to be able to give to every dispute in the way you might want from a branch of government that's charged with resolving disputes.
Yours is sort of a qualitative approach, and I love it. And I think the technology could probably do both. Have you heard from judges? Do you know of any judges who want to experiment, have reached out, or are interested in the idea?
Dave Hoffman: So we know that one judge, at least on the Eleventh Circuit—Judge Newsom—used the technology and cited our article in two cases.
He got a lot of pushback, and I think that he sort of—at least as I read the work—was saying, “Look, this is how I get to my answer using the old-fashioned methods. Here’s using the LLM—and I can get to the same answer. That gives me some kind of confidence or comfort in what I’m doing. It helps me calibrate my own judgment,” which is exactly what we recommended.
I presented this paper to a bunch of different audiences, including audiences of judges. It’s interesting—they have exactly your intuition, which is: “Yeah, what you're saying is great, but I’m actually just struggling to manage a several-hundred-thousand-page record. And I really want basically a souped-up search engine.”
So: can you help me with that? Can you help me with document management? Can these models make the massive work that's in front of me—with the tiny resources I have—more tractable?
We’ve also presented this to lawyers, and I think lawyers are pretty interested in the idea of being able to use technology to get more predictable answers for their clients about what is likely to happen in front of a court.
Most of the time, what we’re aiming for is: what do the parties mean by this text? And there’s not a ton of evidence about that meaning—because if there were, we wouldn’t be disputing it. And our goal is to say: what is their “plain meaning”? What would they mean if you were just kind of some country lawyer looking at it?
And I think it’s very frustrating for lawyers and their clients to be in a position where, if you use one dictionary with one date and the third definition, versus another dictionary on a different date with the second definition—and mix in a Latin expression—you get totally different answers. So I think there’s some excitement in the community to try to get tools that can give more predictable answers.
But judges are worried about explainability. That’s my sense. When I pitch this to them, they say: “Yeah, but how exactly do I explain how the model got to where it is?” Because if my answer is the dictionary, I can tell you exactly what page. If it’s Black’s Law Dictionary, page 400—that’s an answer. If it’s a model, how do I explain it? So that’s the challenge we’re continuing to work on.
Bridget McCormack: My answer to that judge is—it feels like apples to oranges, because your colleague found that other dictionary and that Latin phrase and definition three. And the part I’m interested in is: how did your brain get from here to that particular dictionary? We never ask humans to show their work that way, and humans couldn’t show their work that way.
So what we’re asking of the technology is just—we want more, which is fine. I think it can give us more eventually. But do you think I’m wrong about that? I mean, I think the interesting part is: why did this judge pick dictionary A and the other one pick dictionary B? What’s happening in his or her brain that got them there?
Dave Hoffman: I think that’s a feature (or bug) of the legal system—that answers that we’ve come up with in the past are presumptively correct. And so that does protect us from some really bad, faddish decisions, because we don’t change particularly easily. But it also, of course, slows our ability to react to changes in technology that are useful.
So this is exactly the pitch. When I pitch this to law professors or lawyers, I’m like, yes—but compared to what? Compared to what is this thing that we’re doing right now that really has no sense to it at all, but is merely just the thing we’ve done in the past? Why wouldn’t we think about this technology?
And the answer could be that it’s just not accurate—that the models are not, in fact, accurately predicting how the majority of people reading that term would find meaning. And of course, we could not do surveys of meaning in every single case. There are some people who have pitched that idea—like, let’s interpret contracts using surveys—which is a great idea if you have infinite money and infinite time.
But state court contract interpretation doesn’t work that way. Even federal court contract interpretation doesn’t work that way. No one has the ability to field surveys. So what we’re trying to do now is actually just run a head-to-head of machine versus human surveys, because academics have essentially infinite time.
So we can try to do this once, and we’re trying to compare: how do LLMs—particularly modern iterations—fill gaps in contracts versus how individuals fill gaps in contracts? And try to show under what circumstances the LLMs approximate the majority answer, when they do better, and when they do worse.
So that’s the research we have ongoing. I think the judges’ implicit concern is not exactly that—it’s more like: maybe it’s just wrong, and I wouldn’t know. And so we want to give some comfort on that answer.
Jen Leonard: I have a question, Dave, on the underlying research process itself. You mentioned that the paper came out in 2024, and we’re already eras beyond the models you were working with.
Bridget and I were talking offline about the paper that just came out of the University of Chicago comparing judges and LLMs on sentencing decisions. And the findings are really illuminating and interesting.
But when you dig in, it’s GPT-4, and we’re in another world. So how are you thinking about the pace of the tech changing as you’re trying to analyze it and produce usable research?
Dave Hoffman: I mean, this is just going to be a challenge to research going forward. The things that we believe today to be true—by the time the publication happens—may no longer be true. And in five years, they really might not be true.
So a couple of things. One is you can try to design questions that are robust to changes in technology. So one thing academics can do is try to explain the technology thoughtfully—what the relationship is between law and this tech—and say: this thing that we’re doing now is an illustration of what we think is plausible. This is obviously not future-proof. It’s just an exemplar of the thing, and it’s going to get better. That’s what we said in this paper.
I do think the half-life of academic scholarship is collapsing. And that is bad for me. But of course, we have to live with change. I also think the cost of producing academic scholarship is going down. So one hope I have is that we move toward writing shorter pieces.
Producing a 70-page article takes more time than producing a 30-page article. So maybe we move to more of a continuous production model—shorter pieces, quicker interventions—and save the bigger work for non-empirical papers that won’t go stale as quickly.
But these are all things I’m struggling with, and I think a lot of my colleagues are struggling with: what are we doing? What’s the right way to contribute to society given the privilege we have in these academic jobs? Is it to write a paper that, in 18 months, looks obviously incorrect? That doesn’t feel right. People can read it—and the article still reads okay—but whether it will read okay in a couple of years, I don’t know.
Bridget McCormack: That’s super interesting—and actually exciting. First of all, amen to shorter law review articles, even if we didn’t have LLMs. But I do think there will be a better interaction between practice and the academy if pieces are shorter and more accessible.
Busy lawyers are just never going to read a 75-page article. They’re not going to do it. But I think they would read—if it’s in their area—a five-page piece. The way I read your Substack—I make time to read your Substack. I don’t know if I’m reading your 75-page article. I might read a summary of it.
Lawyers and Contracts
Bridget McCormack: Okay, let me switch to lawyers and contracts.
Where do you think this technology is going, and what’s it going to mean for lawyers who write contracts kind of for a living? Is it going to—maybe we don’t need contracts because the AI will be so good at determining our goals that we won’t need to write them down?
Like it’ll magically match up our goals so we all understand each other and never have disputes anymore. I’m not sure if that’s where your idea was headed, but what is it going to mean for lawyers who write contracts?
Dave Hoffman: So I would just start by saying: I don’t know anything. And the people who are in this field know a lot more than I do. Also, really—no one knows anything. The future is unclear. A real challenge for transactional lawyers is figuring out whether what they’re doing is valuable. There’s a lot of fussing about contract terms. There’s a lot of iterating on contract terms in transactional practice.
One worry that transactional lawyers I’ve talked to have is that AI is going to take a lot of the work that lawyers currently do and automate it, or allow the human to be taken out of it entirely. So if you’re trying to compare track changes across lots of documents, AI is definitely going to be better at that—or at least cheaper.
If you’re trying to flow changes across a really large set of documents, these models are going to be pretty effective. So in the short term, there’s a worry about taking the value that lawyers currently provide and destroying it. Lawyers’ attention to detail across document production—their ability to amass documents, keep track of them, be production engineers—all of that is declining in value.
On the other side, in high-end practices, no one really knows. They know that particular terms are important—but the price of the bonds or the debt doesn’t exactly reflect the value of the terms. Because it’s very complex—computationally extremely hard—to figure out what would happen in a future state of the world if the debtor can no longer pay and covenants are triggered.
So there’s lots of fighting about covenants—what’s the right setting of the term—but they’re not really sure how to value that service. A possible future is that these models help us get traction on valuing individual contract terms inside really complex agreements.
That would be really cool. It would show the value of lawyer production. It might also make it harder to charge for it. So I don’t know what we’d do for money—but it would help us get traction on this really important question: what is the value of what lawyers are doing?
For litigators, it’s easier—you win or you lose. You can tell a story about why you won. For transactional lawyers, the story about value is much harder to access and explain. So there’s a hopeful version of this: LLMs, because they’re good at organizing information and finding patterns, could help us do valuation.
We have these really well-articulated contracts that cost a ton to write—and most of the time, they’re never litigated. They just guide the parties. So one idea people have floated is: why not not write everything down? Instead, have a more informal agreement—call it legally binding—and then say: Claude Max, or whatever future version, is the compiler of our agreement.
Whenever we have a dispute, we feed it in and have it generate the term that resolves it. That would be a really big deal. It would cut a lot of money out of lawyers’ pockets. It would be enormously efficient. I’m not sure it’s ever going to happen—but it would be transformational. Not just for legal practice, but for the world—if we stopped relying on written contracts for commercial ecosystems.
Bridget McCormack: That’s super interesting. Another idea you’ve talked a lot about is how we’re all kind of drowning in unread form contracts. My question is: does AI make this worse or better? Like worse, obviously, because it can generate and distribute boilerplate, right?
Or does it make it better? Because soon—any minute now—I’m going to have my agent negotiating and executing with other agents. So maybe that’s better. Where does this go in the “we’re drowning in contracts” world?
Dave Hoffman: I want to resist the “better or worse” framing—I think it’s just different.
The consumer contract world—which I’ve definitely written a little bit about and thought a little bit about—the problem with that world is there are too many contracts. One possible future here is that AI agents help us navigate this world.
I don’t even know that I want that. I don’t want to have to tell some agent that I have particular views about class action arbitration waivers—which is really what we’re talking about in a lot of these consumer contracts. It’s not exactly the terms of the goods, but rather the terms of the litigation that follows when the goods are bad.
And it just seems unlikely that people are going to want to develop informed views enough to be able to tell their agent—or that they’re going to want to totally delegate to their agent—really any kind of decision authority about what happens on the back end of a lawsuit they don’t want.
It’s possible, I guess, but one of the reasons we live in the world we live in—that is, we have the contracts we have—is because the number of people who care about this particular issue is pretty small.
And it would be hard to imagine getting a coherent mass of individuals to care about whatever exculpatory clauses are in your services agreement for your refrigerator software. Maybe it’s possible, but it just seems relatively unlikely.
Of course, there are examples—like arbitration clauses—where you can get a political movement around it. But most of the time, you can’t. On the other hand, one vision of this is that instead of having form contracts that are undifferentiated across everyone, you can have individualized contracts for you.
And my view is: that’s possible. But again, I don’t really want to. I mean, I don’t read my contracts. I don’t recommend anyone reads their contracts. I just think that I have better things to do in my life, especially when I can’t actually negotiate.
And the idea that firms are going to provide the ability to negotiate with agents with respect to consumer goods seems to me to be really unlikely—unless there was consumer demand for that exact service, which I don’t think there is, because you’d have to pay for it.
And so what I think we’re going to get is just more contracts. That’s the world we’re walking into. The few remaining parts of your life today that are not governed by contract—those are going to be governed by contract going forward. Which is why my Substack is Contracts Empire. It is imperial. The thing that we’ve got going on here has no natural stopping point.
And the solution to things like that is legislation. We should, if we care about this, carve out certain areas of social life—places where you can’t have a contractual relationship, where you can’t do things through written contract. The example I have is: I’m certainly of an age where I used to take lots of cabs. And when I took cabs, I used to pay with a thing called paper cash.
And when I did that, there was a contract between me and the cab driver about certain things, but it was mostly just default terms—what you have in regular commercial life when you buy a service. Now when I have an Uber, I’m agreeing to all kinds of things.
And the question is: what is the advantage that we have as a society from contractualizing—through written contracts—what used to be unwritten agreements?
And I just think that that logic isn’t stopped by AI. It might not even be touched by the LLM revolution. Although I guess, in some ways, there’s a possibility you get even more contracts because it’s just cheaper for them to be produced.
Jen Leonard: But could it also be, Dave, the combination of the volume and the imbalance of power?
Because I actually find that I do “read” more of my form contracts now—I take them and run them through Claude and say, “Is there anything in here that I should be aware of and not agreeing to? And if there is, what am I going to do about it?”
But I still have to agree. It’s a contract of adhesion, right?
Dave Hoffman: That’s definitely why I tell people not to read them. There’s a lot of sadness in the world. I know what’s in those contracts—it’s so bad.
Sometimes it’s surprisingly bad, but mostly it’s just ordinarily bad. The contract tells me they can do a lot of bad things to me, and I have no recourse. If I wanted to, I could read that 50 times a day—but I would just rather read sci-fi.
And if Claude reads it for me, I’m going to feel even worse, because it’s going to first compliment me—“What a great idea for putting this contract in there. You were really careful and thoughtful.”
And then: “It seems like you’re really in trouble. Here’s a 19-point plan for us to solve this problem together. Step one: call Tim Cook and tell him you’d like to have a different contract.”
Bridget McCormack: Dave, we mostly were talking about B2C contracts, and I agree—we’ll probably have more of them and I will continue not to read them.
But I have been reading more and more—and all the consultants are telling me—that B2B contracts are going to be executed by agents. Agents are going to execute contracts with other agents. And there’ll be a few more stakeholders involved, because payment systems are being automated into that B2B, agentic contracting.
And then there’s the MCP layer, and then there’s the frontier model companies. So I’m already thinking about when there’s a dispute—because agents actually don’t always behave—and if they misbehave and there’s a disagreement, with more stakeholders, it’s a slightly more complicated dispute resolution process. Does it just map onto historical contract law, or do we need lots of new, fun contract law?
Dave Hoffman: I love the idea of new contract law, so I love your optimism—that sounds amazing. People have been thinking about this for a little while in the law review literature—but they’re all 70 pages, so no one would have read them.
So there are two different streams of this. One of them was when you had the first XML contracting, where people were trying to do commercial relational contracts—basically database purchasing systems. People had a bunch of papers on it.
And then in the algorithmic era, pre-LLM algorithm era, people had some conversations about algorithms entering contracts. There’s just not a ton of case law, in part because agents are pretty good a lot of the time.
The one case that I love—and I would recommend to anyone who’s interested in what’s coming in this space, and which I taught for a couple of years before my contract casebook coauthors made me take it out of the book—is a case called B2C2 v. Quoine, which comes out of the Singapore International Commercial Court.
Unsurprisingly, it’s an arbitral court, and they’re basically trying to adjudicate—at the core of it—it’s a contract to buy and sell crypto. A person wrote an algorithm several years before the case happened that said: buy and sell crypto under the following set of conditions.
On the day of the contract, there was an error in the software that allowed the sale of crypto to be well under the market price. If two humans had met that day, this would have been obviously a mistake—to sell and buy at that price. The algorithm was like, not a mistake—I just have a program, I want to do the thing—and it executed the trade.
The platform then seeks to rescind, saying this is just a mistake. It’s a unilateral mistake, which it would be in a human-to-human context. If I were selling you something for one-hundredth of its price, you would have constructive knowledge that it must have been an error. And I think most of the time we would excuse the contract under that condition.
And the court really uses all of this old case law to try to decide how we should think about what the algorithm does to the question of whether or not we should permit rescission under these circumstances. It’s really good. I mean, it’s a really good decision that should resonate with American lawyers.
And then on the back end, after it basically says that the algorithm creator wins—on the theory that they didn’t know at the time they created the program about the error—the next question is: is Bitcoin a “good” that gets to be rescinded with specific performance, or do we turn it into its monetary equivalent at the time of sale?
So it’s got all of these amazing contract doctrines. It’s a really cool case. And I do think that’s what courts are going to do going forward—look for old cases that provide analogies, and then apply those analogies to try to come up with answers that fit within our existing doctrine. Because courts are not revolutionaries.
Bridget McCormack: That’s so interesting. And now I can’t wait to go read it.
Legal Education
Jen Leonard: Shifting gears, Dave, from the land of contract—which sounds like it’s under upheaval—to another world that you occupy that may or may not be under upheaval: law schools, where you live and work and think, are obviously struggling to figure out what the AI era means for their students and for what they teach.
So how are you and your colleagues and others across legal academia talking about this, trying to plan around something so shapeshifting and elusive? And what are you thinking about what needs to change, if anything?
Dave Hoffman: In every conversation I have, at some point people are chatting about technological change. I don’t think it’s like we have our heads in the sand—or at least the people I talk to don’t. We have a regular sort of workshop series about AI and law. And we have regular newsletters that are coming almost weekly.
I would want to resist something I sometimes see on social media, which is that law schools are going to be behind the curve here. We might be behind the curve, but it’s not going to be for lack of thinking.
All right, there are a couple buckets of things. One of them is: how do we prepare our students to practice law when the practice of law is evolving so quickly? And so that, in some ways, is a curricular question.
To what extent do we want to teach skills—sort of like prompt engineering skills? That was sort of maybe six months ago or a year ago. Do we want to think about incorporating exercises using AI research into regular day-to-day practice inside the law school? How should our writing classes—our legal practice skills classes, our legal writing classes—permit, forbid, encourage, support the use of AI in writing? These are all pretty open, hard questions.
Second, we talk a lot about evaluation. So when I was in law school, we had blue books. Then when I started teaching, which is now 20 years ago, most of my colleagues, over time, moved to take-home exams on the theory that nothing ever happened in three hours. And so, wouldn’t it be better to have an open exam where you’re really testing judgment?
And that, in the last two years, has virtually disappeared from the law schools that I’m aware of, because people just don’t know how to give a take-home exam that is not horribly intertwined with AI. So no one wants to basically test: how good are you at querying Claude given this fact pattern?
And we know that the models are capable of producing perfectly good, if not great, answers—even when you work really hard to distract them. And so lots of people have moved to in-class exams, and in-class exams with internet disabled—maybe not even your computer, so you can’t host your own local LLM.
And some people have suggested: how about oral exams? Because oral exams are just purely in your brain. At some point, you might try to ask yourself: what are we doing? What are we testing for? Are we testing for your judgment in some silo—which is sort of like your IQ, and what you were able to memorize, and how you perform under pressure?
That’s not exactly the same as how good a lawyer you are. I mean, it’s not unrelated to it, but it’s probably not exactly the same. And it’s certainly not going to be the same as how good a lawyer you’re going to be in ten years, when these models are everywhere—or five years—and they’re part of the day-to-day practice. And so we’re really struggling.
I don’t think anyone has a really great answer to the question of: what is the relationship between what people get inside of a law school building, how do we evaluate what they’re getting, and how do we make sure that they’re really good, ethical, productive, happy lawyers?
And then finally—and like, most pedestrian of all—AI grading. You both have worked in law schools, and there’s nothing law professors like to complain about more than how horrible their lives are for the one week a year that they grade. Because grading feels really, really high stakes, and it feels a little inexact—particularly if you don’t give a multiple-choice exam.
And so there are a bunch of papers out there that say AI is really good at providing replicable grades. It can do a really good job of reading this text, and if you give it a rubric, it can tell you pretty well what you would do if you were to read this in your best frame of mind with a really good cup of coffee—as opposed to what might be happening, which is you’re reading it at 4pm trying to finish them all, and your kids are yelling at you from the next room.
And so there’s been a lot of conversation about whether, and to what extent—I mean, the ABA rules don’t, I think, actually permit AI grading right now—but whether, and to what extent, you can use these tools to make our evaluative processes fair on the front end and fair on the back end. And there are no answers to these questions.
They’re just really, really hard challenges to a mode of legal education that we’ve been delivering for maybe a century and a half at this point—and has worked pretty well for a lot of people to produce a lot of really good lawyers for a really long time. And it feels like it’s shifting pretty quickly.
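As Hoffman notes, ABA rules may not actually permit AI grading today, so what follows is only a sketch of the replicable-grading idea from the papers he mentions: give the model a fixed rubric and an exam answer, ask for per-criterion scores, and run it twice to see whether the grades replicate. The rubric, model name, and JSON output format are assumptions for illustration.

```python
import json
from anthropic import Anthropic

client = Anthropic()

# Illustrative rubric; a real one would mirror the professor's grading sheet.
RUBRIC = """\
issue_spotting (0-10): identifies formation, parol evidence, and mistake issues
rule_statements (0-10): states the governing rules accurately
application (0-10): applies the rules to these facts, arguing both sides
"""

def grade(answer: str) -> dict:
    """Score one exam answer against the rubric; returns {criterion: score}."""
    message = client.messages.create(
        model="claude-sonnet-4-5",  # assumed model name
        max_tokens=400,
        messages=[{
            "role": "user",
            "content": (
                "Grade this contracts exam answer against the rubric. "
                "Return only a JSON object mapping each criterion name to an "
                f"integer score.\n\nRUBRIC:\n{RUBRIC}\nANSWER:\n{answer}"
            ),
        }],
    )
    return json.loads(message.content[0].text)

exam_answer = open("student_answer.txt").read()
first, second = grade(exam_answer), grade(exam_answer)
print(first)
print("replicates:", {k: first[k] == second[k] for k in first})
```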
Jen Leonard: You mentioned skills, Dave. And Bridget and I spend a lot of time with law firm partners who are talking about law schools and their need to produce AI-skilled attorneys. Do you view it as a skills issue, or is humanity undergoing a fundamental change in the way that they interact with an alien intelligence that requires reimagining the way we think about cognition and learning?
And I’m listening to all the sort of gymnastics around trying to retain the old way of assessing, and the more I think about it, the more I feel like we’re going through something bigger than that. But I don’t know whether you agree with that. And if you don’t, that’s the end of the conversation. But if you do, what would an approach be?
Dave Hoffman: Well, on one thing, I can just give you a rote answer. On the second one, you’re like, “and now tell me about the future of humanity.”
All right, so this conversation about lawyers wanting law schools to do X repeats endlessly over the last 40 years. It’s like: why aren’t law schools producing practice-ready graduates who can take a deposition? Why aren’t they producing practice-ready graduates who know how to manage a secured transaction?
Sometimes it’s like offloading training, which the law firms don’t feel like they want to do. And sometimes it’s like law firm partners have a sort of romanticized vision of what their actual law school was like, or have a vision about what they think could be accomplished within the building.
Law schools have never been—really have never been—spaces that have real practice learning, because practice changes. Like, the nature of the practice changes. We have tried over time to be responsive to the needs of the market, and we will definitely continue to be. I have heard law firm partners say, “I want lawyers to be AI-ready.” And I’m like, what do you think that means, actually?
Like, what do you have in mind? Is it that you think that you’re going to get some graduate who’s going to know when there’s a hallucination in the file and prevent it? Are you saying that you want them to be able to iterate with Claude so that it works on the draft of a brief?
I think “the thing” is different almost every week, because the technology is changing really rapidly. And the idea that we could be in the business of chasing that car feels very challenging, given my understanding of how slow law schools actually evolve to do anything.
And so I kind of think the second thing you said—which is that this is a really big change to how humans interact with knowledge, how they produce it, how they consume it—and law schools are part of that big change, like universities are part of that big change. And we have to really be careful and thoughtful about thinking: what is our core thing that we’re doing in the building, and what is the thing that we’re preparing our students to do?
For many students, in my experience, the core thing that we were doing was teaching them how to read really carefully and engage in some sort of prediction about how other people are going to act in response to legal texts—which is legal judgment.
Legal judgment is a combination of really careful reading and empathetic imagination about what the arguments are going to look like when you present people with risks. I believe that some of that is work that’s going to be done by LLMs—like, the prediction parts of that task are delegable.
I just don’t know what the world looks like when that happens, because there’s so much other change that will occur at the same time that I just—I’m not sure that I understand exactly all the changes that we have in front of us.
Another part of what we do, of course, is counseling—walking people through how to understand materials that we understand. And I think that that actually is going to be increasingly important as the amount of information we have in front of us increases.
Making sense of the world, helping navigate complex structures, isn’t obviously something that can be done by a machine—although some of the text explanation might be. But the navigation, the discussion, the thinking—I believe that’s core to our mission as well. I mean, I just don’t know what the future will look like. But I think it’s downstream of the future of law firms.
Bridget McCormack: So, Dave, I don’t want to get too meta on all of this, but in answering Jen’s question, you told us what it is you’re doing in the building—or what we have been doing in the building—and I think I agree with you.
And it’s so focused on the one-to-one service model, which, in the old days, made sense because there was a lawyer for every person with a legal problem. And that’s sort of what a lawyer was—an individual counsel or an individual problem solver, whatever that looked like.
And lawyers have also been struggling for the last five decades with how the profession has kind of—or the legal needs of society have kind of—outgrown the one-to-one lawyering model. Is your answer about what we’re doing in the building—do you sometimes think, like, maybe we need to rethink what we should be doing in the building?
Are there other things we could be doing besides training law students to serve in a model that might not match the needs of the public anymore? Or—I don’t know that that’s necessarily true—but it’s a thing I think about sometimes.
Dave Hoffman: There’s no better mark of a really well-trained lawyer than someone who asks a question the answer to which has to be yes.
Obviously, I basically agree. The world has gotten quite complex. The nature of the legal problems that we have is harder than it used to be. And we are not serving the public in ways that I think they recognize as fair and equitable. And part of what we should be doing is responding to the unmet needs that come with the mass provision of justice, which is sort of what we’ve got.
To be real concrete, I’ve spent a bunch of time—and this is just a small diversion, not exactly about AI, but it’s related. I’ve spent some time thinking about residential leases. So I did a project gathering information about residential leases. I found that lots of those residential leases contain terms that are unlawful under Pennsylvania law.
We collected several hundred thousand leases in Philadelphia, and I tried to think: why is it that our justice system seems to permit tens of thousands of our neighbors in Philadelphia to live under leases that the justice system says are unlawful?
And obviously, part of the problem is that courts can only respond to the things in front of them. And the municipal court in Philadelphia is overwhelmed. It just doesn’t have the capacity to clear the system of junk. It just can’t. And it’s not their fault. They are provisioned poorly, but they’re also reactive. They’re not an agency who can go out and do stuff. They have to sit there and wait for arguments to be presented to them.
And they’re not like a lease police—who go and check things for terms that are bad. They announce the law, but they can’t enforce it in that way. So what can we do? And I think the one model of this, coming out of the traditional public interest work, would be to sue some people.
And people try that. They try to sue people for bad terms. And I’m real skeptical about that kind of retail provision of public interest work. I mean, I think it feels good to the people who do it, and it’s important work, but I don’t think it has systemic effects, always.
So I tried something pretty different. I just drafted a model lease, because I thought really what was happening a lot of the time is landlords are really poor, and they would like a lease. And so they look online for leases, and they pick the one at the top of the Google search, and then they use that.
I don’t think these are evil landlords looking to violate the law. I don’t think they care. I think they’re looking for a self-protective lease that is the first thing on Google that’s free. And so I wrote a lease, and I made it free, and I SEO-optimized it so it’s the number one thing on Google if you search for a Philadelphia model lease. And we’ve got a lot of downloads.
And I just think that that’s just a different way to think about—ultimately—how to use law to benefit people and society. And to do so in ways that are robust to the mass production that we actually are living in.
I don’t know how AI interacts with that, and I don’t know how law schools should think about all of the things that are in front of them right now—all the change that’s in front of them. If I thought I could go to Claude and say, “How should law schools change in order to use you to do this thing better?” I would.
But all I know is what I would get right now is, like, “Dave, what another insightful question from you—crushing it with these insightful questions these days—and here’s nine different ideas that aren’t going to work.” So I don’t know. I don’t have answers.
Jen Leonard: Setting aside the AI challenges, it sounds like you think there’s a role in law schools for pushing entrepreneurial and innovative thinking about new models for delivering the law to the public.
And do you also think there’s a role for more systems-thinking education—for students to actually understand how the law is delivered, or not delivered, to the public—in the core curriculum?
Dave Hoffman: It’s a hard question to answer, in part because I think different law schools are doing different things. And I don’t think there’s just one model for how we deliver legal education—though the ABA has tried to suggest that we ought to have just one model.
Whatever one would say about why the ABA is being challenged right now in lots of states on its accreditation, the hopeful thing that I believe is that it would be good for us to experiment with this thing that we’re doing.
It would be good for us to try different models of delivery of legal education. It would be good to see whether there are people who can figure out how we deliver systemic change, how we think about entrepreneurial education, and how we encourage lawyers to work in interdisciplinary ways with other kinds of providers of services.
Those efforts are mostly on the margin, I would say, right now. And they’re on the margin in part because of the regulatory apparatus that we work under, and in part because lawyers are really conservative—and want to see a precedent that works before they really invest wholeheartedly.
It would be great if we saw ourselves more as risk-takers. Whether the incumbents who are in the buildings are the best people to be taking risks is another question. People like me have been rewarded our entire lives for not taking risks, and we have therefore learned a really important lesson about how risks are scary.
Jen Leonard: Thank you so much for spending your time with us, Dave. We really appreciate it. I’ve learned so much.
Bridget McCormack: This has been such a great conversation.