Can AI help lawyers learn—or does it weaken the skills legal training is meant to build? In this episode, Jen Leonard and Bridget McCormack are joined by Daniel Schwarcz, Professor of Law at the University of Minnesota Law School, to discuss new empirical research on artificial intelligence and human legal reasoning.
The conversation explores a randomized control trial studying how law students used AI to synthesize legal materials, apply legal rules, and revise legal analysis. Daniel explains why the results surprised the researchers: students who used AI early in the process performed better later, even when the AI was taken away. The episode also examines the risks of AI-assisted revision, the long-term development of junior lawyers, and what law schools and law firms should do next.
Key Takeaways
- Lawyers are trained to use AI well: Legal training teaches lawyers to ask follow-up questions, probe ambiguity, test hypotheticals, and challenge answers—all skills that can help users get better results from AI.
- The study tested a core fear about legal learning: Daniel’s research examined whether students who used AI to synthesize legal materials would struggle later when asked to apply those rules without AI.
- The results challenged the hypothesis: Students who used AI in the first stage performed better later because the tool helped them build a stronger understanding of the legal rules.
- AI can introduce revision risks: Stronger writers sometimes perform worse after using AI to revise, suggesting that fatigue, time pressure, and over-deference to AI can undermine precision.
- Legal education needs a layered approach: Daniel remains cautious about first-year law students using AI too early, arguing that core lawyering skills must be developed before students can use AI effectively.
Final Thoughts
AI is not simply changing how legal work gets done—it may also change how lawyers learn. Daniel’s research suggests that AI can support legal reasoning in some contexts, but only when lawyers use it thoughtfully, check its work, and preserve their own ability to evaluate quality, accuracy, and judgment.
Transcript
Jen Leonard: Hi everyone, and welcome back to AI and the Future of Law. I’m your co-host, Jen Leonard, founder of Creative Lawyers, joined as always by the wonderful Bridget McCormack, president and CEO of the American Arbitration Association. On this podcast, we explore the rapidly changing capabilities of artificial intelligence and what they mean for the legal profession.
Today, we’re thrilled to be joined by Daniel Schwarcz to talk about some recent research that he and his colleagues have produced around AI and the cognitive development of law students in particular. But we’ll get into that in a minute.
Daniel is the Fredrikson & Byron Professor of Law at the University of Minnesota Law School. Bridget and I have good friends at Fredrikson, including Nora Olsen Bluvshtein and Anne Reinhart, so we feel even more connected to you, Daniel, above and beyond our shared interest in legal education and AI.
We’re going to talk eventually about your research, but we really want to get started by hearing about how you, as a human being, are using AI in our AI Aha! segment. So, what’s something you’ve been using AI for, Daniel, that you find particularly interesting?
AI Aha!
Daniel Schwarcz: Yeah, well, thanks. It’s great to be here. I use AI to help me with almost everything. But what I would say has been an aha moment for me is something a little bit more general. What I’ve realized is that the way lawyers are trained, in some ways, implicitly or indirectly, is to use AI well.
What I mean by that is that lawyers are trained to ask follow-up questions, to clarify, to identify ambiguities in answers, to probe those ambiguities, to ask for more precision when appropriate, to ask about hypotheticals, and to ask about counterarguments.
All of those skills that lawyers are trained to deploy in the ordinary course—whether it’s arguing in front of a judge, in a deposition, or what have you—are actually the perfect skills to use AI well. Deploying them when you’re using AI can help you get the best out of it.
So, in some ways, I think many lawyers have this sense that they don’t know how to use AI well. They’re not software people. They’re not technical people. And so, they think they’re limited.
But if they embrace the idea that the training lawyers have, just in terms of dealing in the adversarial process, is in some ways the exact skill set you need to get the most out of AI, that should hopefully embolden them and motivate them to start working with AI more—and using those skills to get the most out of it.
Jen Leonard: I love that. Bridget, we’ll have to update our presentations, because we always talk about how lawyer personalities set us up to be the exact opposite of what we need. But I love this framing.
Bridget McCormack: Yeah, I completely agree. Actually, I think it’s also true that lawyer personalities are ill-fitting to this technology, which is probabilistic and changing all the time. But it’s nice to have the counterpoint that there is something about our training that makes us well-suited for how to use it well.
I think that’s what you’re saying, and I completely agree with it. I hadn’t thought about it that way before. So, yeah, I love it. I think we have one more slide to add to our standard presentation.
Daniel Schwarcz: Fantastic. That’s what I like to hear.
Testing AI and Human Legal Reasoning
Jen Leonard: So, we’ll dive in now to your new paper and the underlying research that you and your colleagues have released, called Artificial Intelligence and Human Legal Reasoning.
Bridget and I were exchanging this over text. Actually, Bridget brought it to my attention before I even got to see it, when it was hot off the presses on SSRN. We’ve been thinking so much about what AI means for the cognitive development of junior lawyers in particular, and whether it will impact their critical skills negatively. And this paper really started to shed some light on what’s actually happening with new lawyers.
Could you tell us briefly about what the research entailed, what your methodology was, and what hypothesis you and your colleagues were trying to test?
Daniel Schwarcz: Yeah, absolutely. As you suggested, I’ve thought for a while now that the most important risk associated with AI is impairing the cognitive judgment and development of lawyers. There’s been a lot of talk about that, but there hasn’t actually been much empirical evidence that I was aware of.
I had done previous empirical work looking at what happens when you give lawyers, or law students, AI compared to when they don’t have AI. And we were finding what you would expect in terms of huge efficiency gains and even, increasingly, real quality gains. But my sense was—and my sense remains—that many lawyers and law firms are nonetheless reluctant to embrace AI because of the risks.
And that makes sense. There might be a technology that is helpful in many ways, but if it comes along with risks, you still might not be interested in it. There are a variety of risks. There are hallucination risks. There are confidentiality risks. But in my mind, the biggest risk was always this cognitive decline. So, I wanted to study it.
The way we thought about studying it was to use a randomized control trial, which is something I’ve done in the past to measure the effectiveness of AI. I want to be clear: there are limits in terms of what we’re able to measure. But our basic approach was as follows.
Before I describe the randomized control trial, let me just describe the tasks that all our participants completed. There were four tasks, all associated with one assignment.
First, we gave our participants some legal materials—a limited closed universe of legal materials—and asked them to read those materials and synthesize some of the key legal rules. The second stage was that we had our participants answer objective multiple-choice questions about those materials. The third stage was that we presented our participants with a hypothetical situation that required them to think about how the rules and the materials they had looked at would apply in that hypothetical situation. And then the fourth part was that we had our participants revise that memo to make it perfect for a client.
So, what was the randomization—the control and the treatment? We randomized whether or not our participants had access to AI and were using AI at stage one, when they were trying to understand and synthesize the materials.
Half our participants were instructed to use AI and were told how to use AI in a way that we thought aligned with best practices. And half our participants were not allowed to use AI. They had to use the conventional approach of actually reading the materials and trying to summarize them for themselves. After that, no one used AI for stage two, the objective questions, or stage three, the application. And then everyone used AI for stage four, the revision.
So, the core hypothesis we had going in was motivated by some research outside of the legal domain showing that when people use AI, they don’t engage as carefully with the underlying materials. Their brain is sort of coasting.
Our prediction was that even though the participants who had access to AI in stage one would produce better materials as a result, once we took the AI away—particularly in stage three, when they had to think about this novel situation and how the rules would apply—the participants who had used AI would be at sea.
They wouldn’t have internalized the materials as well. So, they would perform less well than the participants who had been forced to confront the materials with their own human brain, unaided by AI.
That was our prediction: the participants in the treatment group who used AI in stage one to help synthesize the legal materials would perform less well than the participants who did not have AI at stage one, once the AI was taken away, because they would not have developed as sophisticated an understanding of the materials by virtue of having used AI.
Bridget McCormack: In other words, you need your human brain to actually do that hard work at stage one in order to perform well at stage three. You need to really internalize it.
I think that’s what we all kind of assume is how it’s working. We still use AI because it allows us to do so much more than we could otherwise do. And there are some times where we don’t really care if our brain fully internalizes it. But for this kind of project, we might care very much whether the lawyers fully understood stage one, if it meant they were going to perform better at stage three.
We obviously want to hear the results and whether they differed from your predictions. But can you say more about what AI they used and how? You said they were told specifically how to use it. Can you say more about that?
Daniel Schwarcz: Absolutely. We used Gemini 2.5 Pro, which was made available to all University of Minnesota students. Our participants were 2L and 3L students at the University of Minnesota.
In my past research, which just looked at what happens when you use AI versus not, we had mostly used OpenAI’s ChatGPT. So, this was different in that regard. It was a frontier model at the time we did the research, which was in fall of 2025. It’s very hard to keep track of things, but it was a pretty good, capable model even by today’s standards, I would say, even if it’s not frontier anymore.
In terms of how we instructed them to use it, remember, we had a closed universe of materials. So, we instructed them to upload those materials and prompt the AI to synthesize them and come up with a rule. Then they were instructed to read those materials and check the accuracy of the AI output, to ask follow-ups of the AI to the extent that any portion of the rule was not clear, or to the extent that there were details they felt might not have been fully fleshed out by the AI.
That was the kind of model I had in mind when I think about how associates use retrieval-augmented generation tools like Westlaw, Lexis, or Vincent. The AI produces a summary memo, but then you can ask follow-up questions. And what you should do—what you absolutely should do—is check the underlying sources, make sure they’re accurate, see if you need to add things, see if you need to change things.
That was the approach we were trying to replicate in stage one, using an AI tool that was like the ones that are commercially available, to replicate that process. We didn’t want to tell them to use Westlaw, because that would produce a lot of variation in terms of the sources they came up with. So, we tried to replicate that within a closed-universe setting.
Bridget McCormack: So, you chose the frontier model over Westlaw on purpose because Westlaw would have produced too much variation in the results? Is that generally true? I don’t use Westlaw AI much, so I’m not knowledgeable.
Daniel Schwarcz: Yeah. The difficulty is, yes, because it’s a RAG model, an AI model, so you’re going to get different results every time you use it, and across participants you’d get a ton of different results. We were also constrained by our experimental setup. We needed to have an experiment that worked within three hours because of practical difficulties.
If our assignment had been, “Go ahead and research this issue using Westlaw,” that would have been a much longer assignment. There would have been a lot of different sources. There would have been variation in the sources. So, we wanted to control all of that.
One of the things about running a randomized control trial is that there is artificiality involved, because you’re trying to control a lot so you can get at the causal mechanism.
That produces internal validity, which is great, but it always comes at the cost of external validity—the fact that, at the end of the day, there is some artificiality involved. We don’t think that undermines our results, but it is an inevitable tradeoff when you’re trying to structure an experiment to get at a core hypothesis.
What the Study Found
Bridget McCormack: What were the results, and did they differ from your predictions?
Daniel Schwarcz: Yeah. On this core issue, our result was the exact opposite of what we hypothesized.
What we had hypothesized—and we had pre-registered our hypothesis, and frankly, designed the experiment to test this—was that the participants who had used AI in stage one to understand the legal source materials would perform less well once the AI was taken away and they were required to apply those legal source materials. We found the opposite.
What we found was that the participants who had used AI in stage one performed better than the participants who had not used AI in stage one, even at stage three, when neither group had access to AI. So, this was pretty surprising to us. But I think it’s very clear what the mechanism was.
The reason it’s clear is because we’re able to look at the results in stage one. Consistent with past results, the participants who had access to AI produced better syntheses of the underlying legal materials than the participants who didn’t have access to AI. If you control for how participants performed in stage one—when they were synthesizing the materials—the effect goes away.
In other words, what was happening is that AI allowed people to better synthesize and understand the legal rules. Then that produced carryover effects that benefited them even when the AI was no longer available. So it was that effect—AI allowing them to better understand the legal source materials and better understand the rules—that helped them when the AI was no longer available, and they were applying those rules.
In retrospect, intuitively, it kind of makes sense. You have to have a very good understanding of the law and of source materials if you’re going to actually apply them. And if you have a good mental model for how the law in a particular area works, that is going to really advantage you later on when you have to answer questions about it and apply it.
Indeed, AI helped produce that in our participants in the experiment. So pretty surprising results for us, at least in terms of what we were going in with, but results that I think also intuitively make sense.
Jen Leonard: I know we’ll get to the later part of the activities in the trial, but I’m curious, Daniel, because I have been thinking a lot about this without a clear answer.
Bridget and I get these questions all the time when we present. I think it is the red blinking light for a lot of people in the profession: how is it possible to learn the law without doing all of that underlying work?
So, what does your experiment tell the profession about that pervasive fear? How should we be thinking about that?
Daniel Schwarcz: I wish it would be, “Don’t worry about it,” or “Do worry about it.” But we have to be appropriately limited. This is one experiment. And I think it’s an important experiment, frankly, because I think it’s the first empirical result. But it is just one experiment. I think there are a lot of things that we can say and a lot of things we can’t say.
What I would say is this: the experiment makes clear that, in some circumstances, using AI can help you understand complicated materials, and that can then produce carryover benefits. What it doesn’t show is how common those circumstances are, as opposed to the opposite. It’s entirely possible that if we had structured our experiment differently, we would have gotten our hypothesized results. Most importantly, it can’t say what the long-term effects of using AI are.
In my mind, this is actually the biggest risk. As a law school professor, I worry a lot about this. At the end of the day, the main thing I’m trying to teach my students is not tort law or contract law, because they’re going to forget most of that. It’s to develop the skills to learn those laws, because those skills have long-term benefits. I do think there’s still a lot of risk and uncertainty associated with people using AI so often, or so much in the wrong way, that they undermine their long-term skill development.
So, I still don’t think we have any data on that. I still think it’s a big risk. But at the same time, what we were really focused on was more of the short-term risk, which I still think is a big risk. And it’s one of the things that I think is actually causing reservations in much of the legal field. To be honest—and to be a little bit cynical—many law firms don’t care about the long-term development of their lawyers.
Jen Leonard: Well, I was just going to say that. For firms, and honestly for junior associates too, a lot of the concern is: will they be able to have work to do if we can’t bill for the whole timeline of them learning?
To me, it feels a little bit optimistic for the junior associates, in the sense that maybe the business model can adjust so they’re able to get up to speed on the law and contribute to the firm’s work long-term. I’m not really sure, and obviously you don’t know, but what do you think about that? Is it a positive story for juniors?
Daniel Schwarcz: Yeah, I do think there is a positive story—again, being appropriately limited in what we can say based on the evidence. But I think the positive story is that AI can actually allow junior associates who, without the aid of AI, might struggle to understand or apply certain materials, not only to produce better work, but then to be better advocates and lawyers when they don’t have the AI.
It can allow them to be better in oral arguments. It can allow them to be better in client meetings. It can allow them to be better when they have discussions with partners. It can allow them to see connections they might not otherwise have seen. That was one of the things I really worried about going into the experiment, and that motivated my hypothesis: that people would use AI to shortcut and maybe produce something that looks good on paper, but then miss the connections and not be able to explain things.
I’m sure we’ll talk at some point about the practical lessons and how to use AI appropriately. But if used appropriately, there is a lot of potential for AI to make you a better lawyer in the short term. I do think that’s a very optimistic story, and at least one potential story our study is suggestive of.
Bridget McCormack: One of the interesting things about the study was that, during the revision phase, the stronger writers submitted poorer work when they were aided with AI, while the writers who didn’t start with a strong human draft actually improved.
Do you have a theory on why that occurred? And what does it mean if you’re a supervisor at a law firm? Do you have to assess whether your associates start as strong writers or weak writers, or somewhere in between, to determine how much AI they should use in their revision process? It feels hard to sort out if you’re trying to help junior lawyers. What do you make of that particular finding?
Daniel Schwarcz: Yeah, you’re absolutely right. This was another hypothesis we had that was not quite borne out. We expected that in stage four—remember, in stage three, no one used AI, and then in stage four, everyone used AI—we hypothesized that if everyone used AI to make their work better, it would make everyone’s work better.
We certainly thought it would have stronger effects for those who had done less good work initially, because there would be more fruit to pick in terms of making improvements. But if I say, “I’m going to give you this really powerful tool to help you make your work better,” you would think it would make everyone’s work better. And we didn’t find that for the strong participants. We found that for people who had done quite well without AI, giving them AI at the very end actually made their work worse.
Now, what explains that, and what can we learn from it? This is now speculation. I can’t say for sure. Unlike the first result, where I think we have enough data to explain it, here we really don’t. But what I believe to be the case is that stage four was at the end of the experiment. It was a three-hour experiment. It was intensive. Our participants were under some stress because they had financial incentives to do well. And they only had about 20 minutes at the end to use AI to help them.
I think what happens—and this is consistent with my own experience—is that when you’re using AI and you’re tired and don’t have a lot of time, there’s more risk involved. That’s particularly true when you’re using it to revise something that you thought very carefully about. One of the things that can happen—and again, I’ve had this in my own experience—is that AI will subtly shift your words or your language in ways that undermine or make ambiguous your meaning, or undermine your precision, in ways you might not catch initially.
It’s very easy when you’re reading your own writing to overlook ambiguity or imprecision because, in your head, it makes a lot of sense. So, I think what happened is that the participants who had done a really nice job initially were cognitively tired, working under time constraints, and deferred too much to the AI. The AI actually undermined precision, undermined clarity, and produced repetition. And the people who had performed well initially were not well situated to see that, given the time constraints. That’s what I think happened.
At the end of the article, we try to say, “Okay, here are some of the lessons about how to use AI effectively.” One of the core lessons from this—and again, we need more data to develop more lessons and figure this out—is that you have to be wary of using AI when you’re under time constraints or cognitive constraints. That’s when errors occur. That’s when it can actually undermine the quality of your work. So that’s my sense, but I think we need more empirical evidence to really get deeper on that question.
Jen Leonard: Hopefully we’ll have more empirical evidence in the years ahead. It also seems to echo the study of BCG consultants from a couple of years ago, where AI helped most performers but actually weakened the strongest performers.
Daniel Schwarcz: I think that’s right, although it’s hard because one of the things that’s really tricky in this field is that the quality of the AIs is changing so much. That BCG study—I can’t remember exactly what model they were using, but I think it might have been the initial ChatGPT or maybe GPT-4.
In some of my prior research, we had actually found the same result. I think in that one, we were testing GPT-4, and we found that it had this leveling effect, where it helped the participants who had not done as well on a baseline test, but it hurt others. But in subsequent research, when we were testing the o1 reasoning model and also a retrieval-augmented generation model, we found that it helped everyone, including our high performers.
So, I think there’s a question. On one hand, if the model is more capable, it is going to be able to help more people. But I also think part of what was going on here was not just that question. We were dealing with a very particular use case: participants had thought something through, written something, tried to polish it, and then used AI to revise.
So, I think there’s something particular to the revision task, with limited time, that may be producing the result we saw here. That might be distinguishable from the earlier results, where you might just say, “Well, the AI was not nearly as capable as it is now.” And if it’s not as capable, it’s just not going to help people who are already quite capable.
What Law Schools and Law Firms Should Do Next
Jen Leonard: So, your study focused on new lawyers, which we’ve been exploring and thinking about and worried about. But do you think the outcomes would differ for more seasoned attorneys?
Daniel Schwarcz: One thing I would say is that this is one of the comments I get most often. And I always say, “I would love to do that experiment. I just need a lot more money.” We’ve been able to do these experiments with great law students at the University of Minnesota—and I’ll take a second to say they’re fantastic. They’re great law students. But of course, they’re not real lawyers. We think they’re pretty good proxies for first- or second-year associates. Certainly, we have summer associates who do very good work, and law firms often give real work to summer associates.
On the other hand, we don’t know how much these results would extend, or how they would differ, if you were dealing with seasoned attorneys. I’d love to do that work. It’s just a lot more difficult because you need a large number of people. You probably need at least 100 participants, or somewhere in that range, to get the statistical power. And paying for 100 attorneys to do three hours of work is just not easy. So, I haven’t been able to do it yet. I’m always looking for new funding sources, so if anyone listening has money to give away, please contact me.
I can only speculate. But in speculating, I actually want to go back to my AI Aha! moment from earlier. I do believe that senior attorneys can get a lot more out of AI than junior attorneys. I think that’s true not only because senior attorneys are better at deploying some of the tools we talked about—asking follow-up questions, asking for clarity, asking hypotheticals—but also because those are skills we cultivate over the course of our entire careers.
Another core lesson I focus on when I do trainings and talk to learners about how to use AI is this: the best use cases for AI are when you, as the user, are really well situated to evaluate the accuracy and quality of the output.
If you think about junior lawyers versus senior lawyers, a huge part of what senior lawyers do is say, “Okay, this looks good,” or “This isn’t good—redo it,” or “This is good, but did you consider this?” Or “I think there might be a case on this point. Go figure that out.” Or “Did you consider this counterargument? Go back and do that.” A big part of what senior lawyers are good at is evaluating whether work product is good or not, where it might be deficient, where it might be extended, and what else might be needed. And that’s exactly how you want to use AI.
If you’re really well situated to say, “This is great,” then fantastic. So, I predict that, in general, senior attorneys can use AI more effectively and get more use out of AI than junior attorneys. I haven’t been able to test that. Whether that translates into being able to use AI to help you when you no longer have AI is harder to say.
But I also predict there is less risk of long-term skills degradation for senior attorneys, because the skills are already baked in. That’s not to say there’s no risk. But for senior attorneys who have cultivated good writing skills and good drafting skills over time, I think using AI is less likely to degrade those skills. That’s a partial answer, but it’s hard to know for sure until we get the data.
And I really hope we can continue to get more data. One thing I’m very confident of—even though this is data-free speculation—is that AI is going to fundamentally transform the practice of law. How? To what extent? In what ways? I’m less sure of that. But if AI is going to be the most disruptive force to law that we’ve had in decades, if not ever, then it behooves all of us to get empirical data, understand it, and be able to adjust seamlessly in how we train, how we practice, and how we serve our clients. So, we’ll see. Hopefully we’ll do that.
Bridget McCormack: I really hope you can keep doing this research. It is such an important contribution. Maybe I’ll end by asking you for takeaways, both for law schools and law firms. What are the takeaways for law schools based on this particular study? And what are the takeaways for law firms? What advice would you give law firms based on what you’ve learned through this work?
Daniel Schwarcz: Yeah. Well, I think the big takeaway for me is that people can’t be hiding from AI anymore. AI is very clearly capable of improving the speed and quality of the work people do, which is some of the work I’ve done previously. But it’s also capable of making us better lawyers.
At the same time, there are still risks. And we all know the risks. But those risks are risks that must be managed and addressed in a thoughtful way, as opposed to simply forgoing the underlying activity that creates the risk—which is the use of AI. So, I think everyone in the legal ecosystem needs to be thinking about it that way. This is a technology that has so many benefits that we need to be incorporating it intelligently. But we need to incorporate it in a way that manages risk. How do we manage the risks?
There are a few lessons, some of which we’ve already talked about. Make sure you’re not using AI when you’re tired. Make sure you’re using AI in contexts where you’re really well equipped to evaluate the quality of the outcome. Obviously, always check the AI results and the accuracy of the AI, which is something many folks forgo.
Another tip that falls out of our research design is making sure you’re using AI in a way that doesn’t ask it to perform many different tasks together, but instead breaks tasks down into small, bite-sized pieces. I think that’s one of the reasons we got the results we got.
If you’ll recall, our experimental design broke the task into bite-sized pieces. First, here are the relevant materials. Feed them into the AI. Now read the AI. Now think about the results. Now apply it to the scenario. When you break tasks down into small pieces and use AI for that, it allows you to preserve your own mental map of the pieces. So those are some takeaways in terms of how to use AI.
But the other core takeaway is constant reevaluation. Constantly thinking about: did this work well? Did this not work well? And constant practice. Those are some of the things I think we need to think about generally. As for law schools, I’m still a traditionalist when it comes to my first-year students. I actually ban AI use. I make sure that when they’re taking exams, they don’t have access to AI. It’s in a closed-book format where they physically are not capable of accessing AI.
And that’s because of the long-term skills issue, and because we don’t yet have data on long-term skills development. I’m pretty confident—even though I don’t have empirical evidence to back me up on this—that our first-year students and early lawyers need to develop the skills themselves in order to use AI. That means developing the skills we’ve been talking about: asking, “What about this hypothetical?” “Is that ambiguous language?” “What about this counterargument?” “How does this extend here?” “How do I argue in the alternative?” “How do I be precise?”
Those core lawyering skills need to be cultivated and developed in order to allow you to use AI. I think there are a lot of risks if you’re trying to cultivate and develop those skills simultaneously with using AI. So, I think we need a layered approach. Once you get to the second and third year of law school, then we need to start training people how to use AI. That means learning the empirical data. It means trying to distill principles. And most of all, it means practicing. Those are some of the takeaways I have, and hopefully we’ll get more and more takeaways that are empirically informed as the months and years go forward.
Bridget McCormack: Interestingly, in addition to continuing to evaluate what works and what doesn’t work, like you said, we have to do it with all the new models as well. All of this is changing every couple of weeks. And I assume even your results are something you would probably want, in a world of unlimited resources, to retest with different models, because the results could be different.
Daniel Schwarcz: I absolutely agree with you. I think the research agenda here is never going to be done. The research is incredibly helpful to set baselines and show trends, but the underlying technology is fundamentally changing. This has really been brought home in the last week or two with Anthropic’s latest model. I keep thinking, okay, we have this model that apparently can 100x the capacity of people who do cybersecurity to detect zero-day bugs.
And I think, okay, I know there are differences with law. Reinforcement learning is harder in law because there are not as many verifiable, tight feedback loops you can have. It’s not always clear what the right answer is.
But even so, you hear about superintelligence in the context of cybersecurity and coding. And it’s hard to imagine we’re not going to see similar jumps in capabilities. So bottom line, you’re right. We need to be constantly reevaluating this and constantly asking these questions. Our answers give us baselines, but they don’t tell us where we are.
Jen Leonard: Well, thank you so much for joining us, Daniel. As Bridget said, your research is so important. There’s frequently a dearth of academic research around the profession, and Bridget and I are out there so often trying to help practitioners and leaders think through these things. Every time one of you and your colleagues’ studies comes out, we have new material to talk with them about.
We’re grateful to you on behalf of the entire profession. Thanks for joining us. And thanks to everybody out there who’s listening or watching today. We appreciate all of you as well, and we look forward to seeing you on the next episode of AI and the Future of Law. Until then, take care.