The Impact of Generative AI & Driving Innovation within the Court System

 

 

Summary

In this episode, Bridget McCormack and Jen Leonard explore the transformative impact of generative AI on the court system. Drawing from their personal experiences, they discuss how generative AI has influenced their legal research and even their triathlon training plans. The episode unpacks the concepts of Retrieval Augmented Generation (RAG) and the balance between precision and imprecision in generative AI, highlighting the challenges and opportunities this technology presents for courts and judges.

As they delve deeper, McCormack and Leonard address the varying reactions of courts to generative AI—some embracing the innovation, while others are more cautious. They emphasize the critical role of leadership in driving innovation within the judiciary and in overcoming skepticism toward AI. The conversation also shines a light on a strategic partnership between the National Center for State Courts and the Thomson Reuters Institute, designed to educate the judiciary about AI and develop informed policy responses.

Key Takeaways

  • Courts are cautiously curious: Judges and administrators are interested in AI but face barriers like limited resources, lack of training, and uncertainty about how to implement tools safely.
  • Leadership drives innovation: Courts that are experimenting with AI typically have forward-thinking leaders who encourage experimentation and strategic use of the technology.
  • AI is better for creativity than precision: Generative AI is great for brainstorming and strategic support, but struggles with tasks that demand high accuracy—hallucinations and false positives remain a risk, even with RAG (Retrieval Augmented Generation).
  • Legal rules around AI are still evolving: The Fifth Circuit's rescinded rule shows the legal system is still figuring out how to regulate AI use—many current ethical obligations already address accuracy and honesty in filings.
  • National efforts are underway: The National Center for State Courts and Thomson Reuters Institute launched a partnership to support courts in responsibly adopting AI, with a focus on education, policy, and expanding access to justice.

Transcript

Jen Leonard: Hi everybody, and welcome back to 2030 Vision, AI and the Future of Law. We are your hosts, Bridget McCormack and Jen Leonard, and we are thrilled today to be talking about a really vital issue: how does generative AI impact the court system? I'm really excited to learn a lot about this from Bridget, and I'm going to step back because this is not my area of expertise. As the former Chief Justice of the Michigan Supreme Court and an active member in lots of efforts to expand access to justice and to educate across the profession (including the judiciary), I know Bridget will have lots of thoughts about the courts. So Bridget, I'm looking forward to engaging with you on that if you're game for it.

Bridget McCormack: I am, and we're probably going to do this over a few episodes — at least two, maybe more. I'm sure we'll come back to it, because there's no way you can cover all of the things happening in courts in one episode. So we'll just get started on the conversation today, and then I look forward to continuing it over the next several episodes.

Jen Leonard: Same. And as we always do, we'll start with our Gen AI moment of the last couple of weeks. So, what have you been using generative AI for, Bridget, that you have found to be magical and delightful?

Gen AI Moments

Bridget McCormack: So, I don't know if these are two boring ones, but I have two quick ones. I interviewed Adam Unikowsky last week (whose experiment with Claude and Supreme Court briefs we'll talk about in a future episode), and I was so impressed with his results that I started using Claude for legal research. Obviously, I'm not relying on it entirely — you know, I understand I have to actually do the work and make sure it's correct — but I had a sort of wonky immigration/tax/employment question. I just needed to understand the barriers and the potential regulatory framework I had to worry about before I talked to a lawyer. I was going to be talking to a lawyer, but I wanted to use that time efficiently (because I wanted to pay for fewer lawyer minutes).

So I did my sort of "teach me the basics" routine with Claude, and it was actually fantastic. It gave me exactly the right questions to ask. It made the conversation far more sophisticated and efficient than it otherwise would have been. So that was one example of how I'm starting to use frontier models just to teach myself a legal framework when I'm in a new area.

And then the second thing was I asked Claude to do a triathlon training plan for me. I rode a century (100 miles) a week ago Saturday, so my cycling is really good this summer, but I haven't been running or swimming at all. I would love to do a couple of little sprint triathlons in September — Michigan has wonderful lakes, so it would be fun to do one. But, you know, there are all these online programs for how to train for a sprint triathlon, and this is the first time I've been able to say I'm in really good biking shape. I just biked a hundred miles in a day about a week and a half ago. So I wanted the AI to figure out how I can get good enough at running and swimming to do this race. And also, by the way, tell me what I should do about nutrition and sleep and all the rest. It gave me this lovely personalized little program. I have no idea if I'll be able to do it, but it's nice to have it. Now it's sitting next to my computer. How about you?

Jen Leonard: Those are really cool, and I'm glad to hear that you asked it about sleep because I'm not sure when you actually sleep! Every time I see you, you're flying from here to there and then biking 100 miles. So let me know how the sleep goes. 

I had sort of two experiences in the same activity. I was drafting, for my own work, a very simple professional services agreement. It was not complicated, not a huge dollar amount, but I just wanted something to codify the agreement. So I worked with Claude as well to generate it. I gave it some context, told it what the scope of services was, and it generated a great first draft for me. I made a couple of adjustments. And then, as part of that drafting process, I was trying to find the principal place of business for the other party and used Perplexity for that. If people aren't familiar, Perplexity is an AI-infused version of Google where you can ask questions and get an AI-generated summary, but it also helpfully includes links to the underlying sources. So I was able to click on the links and verify through public documents that it was the correct principal place of business. Those are the kinds of things that used to take me a good 15 minutes to find, maybe not that long, but it definitely saved time, and there's no value-add in doing that kind of lookup by hand. So I found it to be very delightful. Not as fun as triathlon training, though!

Bridget McCormack: Yeah, we'll see if I follow through, but it's fun to have it all written down. Perplexity has been a game changer for me. It's really made Googling a last-resort tool for me because Perplexity kind of ties it all together for you and gives you the links to verify information right away.

Jen Leonard: The links are so key because I feel like it's a helpful way to start to solve some of the hallucination issues. You can actually click on the source material, like you said, versus just relying on the text output.

Definitions: RAG, Precision vs Imprecision

Jen Leonard: Okay, so our next segment that we always do is helping people understand a couple of definitions or concepts that are unique to AI and generative AI. This week we're going to talk about two. The first is RAG. If you're hearing presentations about generative AI, you might hear people invoke "RAG." What is RAG, Bridget?

Bridget McCormack: So, as always with this section of the podcast, I'm going to do my best — and you're going to correct me if you think I haven't gotten it quite right or could add something. RAG stands for "retrieval augmented generation." The idea is that you take your generative AI model and give it a second data set that's relevant to the question it's designed to answer, basically augmenting its original training data with some specific additional information.

We used RAG in building our Clause Builder AI tool, which is built on ChatGPT Turbo, or maybe it's GPT-4; I'd have to check with my engineers. But we used a RAG process where we gave the model an extra set of reference data: a set of perfected clauses that have been upheld by courts, so we know these are good clauses. When you actually ask it to write a clause, it can look in that specific data set, which is directly relevant to the question it's designed to answer, before it falls back on its original training data. Does that sound about right? Am I missing something important there?

Jen Leonard: No, that's consistent with my understanding. It's kind of like a closed universe that you can work with directly. And I know there's discussion about a lot of the benefits and some of the challenges — which we'll talk about throughout our episodes as well. But that's how I understand it.

Bridget McCormack: So I have another one. Tell me, when people talk about precision versus imprecision — it's an issue you hear discussed in the context of generative AI — what do they mean by that?

Jen Leonard: Yeah, I've been thinking about this a lot recently because I think you and I are both super excited about the future of AI, but I think the early notion that you could snap your fingers and have it do any type of activity has led people to try things it's not perfectly designed for, and they come away disappointed. It is a probabilistic machine: it's making guesses based on its training data as to the next most likely word in a sequence. 

If people use it for a really precision-oriented activity, the probabilistic nature of it means they'll often have disappointing outcomes: it will miss things, or it will hallucinate.

So thinking about how the technology is designed, and using it for less precise activities, has been much more useful for me, at least for now, than, say, uploading a document and asking for a direct quote from it. It often won't give you the direct quote.

And I know Ethan Mollick this week — whom we both follow religiously — talked about just this. One of the points he made is that people are really used to false negatives in search, the Type II errors, where the system tells you a document wasn't found when in fact it is in the data set; it just didn't surface it. But we're not used to false positives, the Type I errors, which is the hallucination effect. We're not used to something confidently telling us something is there when it doesn't exist at all.

So I think it's helpful to think about what the technology is good for at the moment and lean in there. What do you think?
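Here is a toy sketch of the two failure modes Jen is contrasting, with entirely hypothetical data and function names: classic search fails quietly by missing a document that exists (a false negative), while an ungrounded generator fails loudly by asserting something that doesn't exist (a false positive, the hallucination effect).

```python
# Hypothetical HR handbook, keyed by exact topic names.
HANDBOOK = {
    "vacation policy": "Employees accrue 15 vacation days per year.",
    "remote work policy": "Employees may work remotely up to 3 days per week.",
}

def keyword_search(query):
    """Classic search: prone to FALSE NEGATIVES (Type II errors).
    If the phrasing doesn't match a stored key, a document that
    really exists simply goes unfound."""
    return HANDBOOK.get(query.lower())  # returns None on a miss

def ungrounded_generator(query):
    """Caricature of ungrounded generation: prone to FALSE POSITIVES
    (Type I errors). It always produces a fluent, confident answer,
    even when no source document supports it."""
    return f"Our {query} states that employees receive unlimited benefits."

# False negative: the vacation policy exists, but "PTO policy" doesn't match it.
print(keyword_search("PTO policy"))  # -> None

# False positive: there is no parental leave policy, yet we get a confident answer.
print(ungrounded_generator("parental leave policy"))
```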

Bridget McCormack: Yeah, I think that's right. I'm not sure I fully appreciated the nuance, though, and the difference between those two kinds of errors. The false positives are an issue even when a frontier model is using RAG, at least according to what Ethan Mollick was saying.

So I had this interesting example: Right now, we're experimenting with an HR one, uploading all of our HR materials to train these chatbots so that our employees can ask questions they would otherwise send to the HR staff — but they might be more comfortable asking a chatbot. And it allows the HR staff to work on some of the more impactful things, while also giving some employees a sense of privacy, which they might actually prefer.

We tried a few different models — and in one of the questions we asked it, it answered incorrectly. We knew that wasn't the correct policy. And we thought, "Well, that's really weird because it was trained on the HR data." But in fact, we pulled up our HR handbook — and we had the wrong information in the HR handbook.

But it was sort of interesting. I thought the RAG process cut down on hallucinations, and it does, but the point is that with this technology it doesn't eliminate them. And even when the retrieval works perfectly, the answer is only as good as the documents behind it.

Jen Leonard: That’s right. And it’s also why I think you and I have found it to be really useful for things like brainstorming and strategic planning and creative thought partnership. Because like a human partner, that doesn't need to be so precise. You can work out the precision later.

But using it as sort of an upgraded version of “Control-F” to pull things out of documents — that’s much more difficult for it right now.

Okay, so those are our concepts of the week and our Gen AI moments. Now we want to dive into what generative AI means for courts and judges in particular. So Bridget, I'm just going to start by asking you to share some of your thoughts about courts and AI.

Main Topic: Generative AI in the Courts: Transforming Justice Systems

Bridget McCormack: Yeah. And again, we're only going to get through a little of this today, and then we'll continue this conversation in our next episode (and probably many more after that). In some ways, I think this technology is really exciting for courts because of the ways it can help them build tools and services that are really hard to create with human resources given their budgets and other constraints. There is a ton of upside potential.

On the other hand, there's a lot of concern, fear, and worry among court leaders about the ways in which they're not going to be able to keep up with the technology. I've been doing a lot of presentations to judges, courts, and court teams — court administrators — on this topic. It's usually pretty 101-level content (which is good for me, because, you know, I feel good at 101 and not necessarily 201, so 101 is about right). But they are very curious, I would say. I've now done these presentations in many states, usually to appellate courts or state supreme courts, and sometimes to the entire judicial body that governs the administrative process. We've seen lots of court orders — some early ones that were pretty aggressive about banning all use of generative AI by lawyers in proceedings, and then some more nuanced ones. We can talk a little more about that in a minute.

For the most part, courts are both nervous about how this technology might upset their processes and worried that they're going to have a hard time keeping up with it to meet the demands they might see. They're also unsure where they're going to get the training or the technologists to help them. I was at a presentation recently with all of the appellate judges of a state. They had their engineering lead there, and even their chief technology officer had not had any experimentation with the technology. I don't think that was because he wasn't curious — he certainly was. I think there was just a lack of resources, a lack of education, a lack of information. Those things are difficult for courts in every way, but perhaps most difficult when it comes to new technology. So there's a lot of concern out there.

Jen Leonard: What do you perceive to be the difference between the courts you've presented to where concern drives the behavior, and the ones where optimism and excitement prevail once they figure out how to channel resources in that direction?

Bridget McCormack: Yeah, that's a really interesting thing. In some ways, this is just the latest example of courts where you see innovation and courts where you see very little innovation. I saw lots of innovation in certain courts throughout the pandemic, and you do see that now. 

For example, the Maricopa County court is building chatbots for self-represented litigants using this technology, and it's a game changer, right? I mean, those were questions that used to come to one human (or maybe two — I don't know how many people they have) who could answer the public's questions. There's now an unlimited number of AI assistants that can answer them, right? And you can train them in specific areas in a way that's a really wonderful public service. 

There's David Slayton, the court administrator in LA County, who is partnering with a team at Stanford to figure out how to redesign what courts should be for the public that needs them. He views this technology as a huge game changer for what they might be able to do for the people they serve. And I had a conversation with the current Chief Justice of the Michigan Supreme Court last week: she has three new clerks coming in, and she specifically hired people who were eager to experiment with generative AI tools to improve the work of the courts. So you do see those examples; they're rare.

I would say the difference between a court that's leaning into innovation around this technology and one that isn't is usually a single person — a leader, either a court administrator or a chief judge or chief justice — who views this technology as an opportunity to do more or do better than the alternative. You know, judges are lawyers, and lawyers are skeptical by nature and risk-averse by training. They come by all that honestly. So it's not surprising that it's unusual to see someone in power lean into technology like this when none of us fully understand it yet.

Jen Leonard: I think it's so interesting because, as you mentioned, one of the challenges right now for everybody is finding talent on the technology side to help figure out how to use this well. The human leadership piece is so key. I see it in law firms a lot when I'm traveling: you see firms where there's a huge amount of skepticism among leadership versus those that understand the limitations but are optimistic. 

We see it in law schools with people who are really open to driving innovation and those that aren't. It really becomes contagious across the organization and can permeate into the field. I watched this with you personally: you were one of the most innovative Supreme Court justices in the country during the pandemic, and I remember watching a conversation you had with another judge. The other judge said, "It never actually occurred to me that we could try these things." And so I've seen it take place, and I've seen the excitement and enthusiasm of people saying, "I love what Bridget's doing. I think that's really cool, and I want to follow it."

Bridget McCormack: There are lots of people doing far more innovative things than I've ever done, but those efforts really are people-led. You're right — it's the same in law firms, and probably the same in every other organization. If your leadership team is leaning in, I bet your whole organization gets excited. That's when you find all kinds of new and better ways of doing what you do as a result of the technology.

Jen Leonard: Yeah, when somebody in power tells you it's okay to experiment and try, that really drives change. Okay, I want to drill down on a specific instance of a court's efforts to create rules around generative AI usage and what happened. I'd love your thoughts on what it says about the broader AI landscape. And you can correct me if I misstate any parts of this. 

The U.S. Court of Appeals for the Fifth Circuit (a federal appellate court) recently proposed a rule that would have required attorneys to certify the accuracy of any legal filings involving the use of generative AI. They were concerned — after things like the ChatGPT lawyer incident — and wanted to ensure they could rely on the information being submitted to the court. 

Lawyers resisted the rule, arguing in part that the existing rules on the books (both the Federal Rules of Civil Procedure and the rules of professional conduct that govern lawyers) already provide a structure to ensure that filings are verified, accurate, and honest. They also raised the question of how a court could ever determine whether something was written using AI, not to mention that lawyers already use traditional AI all the time in research products and writing tools. Ultimately, because of these criticisms and concerns, the Fifth Circuit rescinded its proposed rule. So I'd love to get your thoughts on what unfolded and what it says more broadly about AI's evolution.

Bridget McCormack: Yeah, I kind of love this Fifth Circuit example because it's a bit of a metaphor for how the legal profession has been adapting to this disruptive technology. We saw lots of court orders coming out pretty quickly when that one lawyer — who sort of set us all back by like four years — copied and pasted from ChatGPT into one brief. The New York Times wrote 25 stories about that one lawyer, and a lot of courts said, "Boy, we'd better issue an order saying either you can't do it at all or..."

In some cases, if you're appearing pro hac vice (meaning you're licensed in another jurisdiction), you can't use it at all. Some courts say self-represented litigants can't use it, which is crazy. But anyway, all these different orders are all over the place. The Fifth Circuit was one of the early ones out of the gate. It issued an order basically saying that lawyers who used AI had to certify that they had checked everything in it.

And it was very generic. It was like, "If you use AI, you have to check everything in it" — which, you know, for anything you file with a court, you already have ethical obligations to make sure it's accurate. Those obligations already exist. When the Fifth Circuit first issued the rule, I think the court itself... I don't know the Fifth Circuit's process, but in the Michigan Supreme Court when we issued a draft rule, we voted it out the door (meaning we were all on board with getting public comment). So I assume the Fifth Circuit thought, "Well, this is probably a good idea. We'll hear from the public." And the public comment built and built. 

There was a very thoughtful comment from Judge Scott Schlegel — who you know, because we had him come to our class. He's a judge on the Court of Appeals in Louisiana (which is in the Fifth Circuit). He's a very innovative judge who's used all kinds of off-the-shelf technology when he was in a trial court to improve the experience for users in his courtroom. He's actually a great follow on LinkedIn. He's always thinking about this technology. He wrote a great comment making all of these points — basically saying, respectfully, colleagues, we already have obligations that if we file something with a court, we make sure it's accurate. His comment made it clear that the Fifth Circuit's proposed rule made it sound like the court didn't really understand what the technology was about. 

And I think the other comments went in that direction. By the end of the comment period, it was actually a good advertisement for rulemaking, right? Because it worked. The public basically said, "You know what, we already have rules that require us to do this, so this is not a good idea," and the court withdrew the rule. I feel like that arc is the arc many of us have been going through — many in the legal profession have been going through. Most people's first reaction — most lawyers' first reaction — was, "God, this can't be good for us, this must be bad for us." And then, you know, like with every other new technology, they thought, "Well, maybe we can handle it. Maybe we can figure out how not to have it get us into trouble. Probably we can," and it's been two steps forward, one step back. So it feels like a bit of a metaphor.

Bridget McCormack: If people are interested in following all the different court orders out there, there are so many. Some are by individual judges and some by particular courts. The RAILS project at Duke Law — RAILS stands for Responsible AI in Legal Services — has a website where they're collecting them. I think when I last looked, there were dozens and dozens from courts not only in the U.S. but outside the U.S. as well. So that's an interesting place if you're interested in seeing what different courts are doing. What I want to know — and I can't find this online right now — is how many of the orders that were issued originally have since been amended or rescinded. I would love to see that: what they all looked like and then how they changed.

Jen Leonard: Well, I logged onto RAILS today and by my most recent count there are 52 orders from different courts about AI usage. And I think your point about the public comment period and response is so interesting, because we've talked a lot about access to justice and the idea of getting more of the public involved in commentary around different rules that could help self-represented litigants. 

In this case, you're talking about a community that is much more likely to respond — the lawyers — because they know how to comment on proposed rules. So how do we take that lawyer engagement and try to use it to make sure that we're not prohibiting self-represented litigants from taking advantage of this technology?

Bridget McCormack: Yeah, and this could be its own podcast episode, but I have a lot of thoughts on this. You know, we do take comment from the public on rulemaking, and rulemaking by courts is really important. It's sometimes the difference between whether you can have your complaint heard or not — really, some rules can keep you out of court or let you in. And we post them on court websites. I don't think most members of the public who might have a justice problem are perusing court websites to see if there's some potential rule change that might impact their ability to get justice. So it's not a great process. 

There have been some really well-organized efforts when an important rule change in a state was going to impact the public tremendously, where legal services organizations or organizations that work on materials for self-represented litigants helped get the information out to the public so they could be heard on an issue that impacts them. We had that happen when we were trying to decide what to do with remote hearings after the pandemic was mostly behind us — whether we should continue to have remote hearings. 

Lawyers had one view; judges had a very strong view in one direction. But really it's the public that you want to hear from, right? That's who the courts are for. So there was a good effort and we got a good response from the public on that. It makes a difference.

Jen Leonard: Yeah, absolutely. In a future episode, we could talk about some of the rules coming out around self-represented litigants, unauthorized practice of law, and the intersection with AI — because there's a lot to unpack there. But we'll save that for another day. Today, on our last topic: if you are a judge or a court administrator, or you're practicing before a court, it just seems so difficult right now to wrangle all this information and try to keep track of it.

Main Topic: National Center for State Courts and the Thomson Reuters Institute Collaboration

Jen Leonard: I was interested to learn from you that there's a new collaboration between the National Center for State Courts and the Thomson Reuters Institute. They recently announced a strategic partnership focusing on AI in the legal sector, particularly its application in courts. The goals of the partnership are to inform and educate the judiciary about all the opportunities and challenges created by AI, to help judges make informed decisions, to expand access to justice, and to develop policy responses so that we protect the rule of law from some of the potential challenges. Could you tell us a little bit about the partnership, Bridget, and why you think it's so important?

Bridget McCormack: I'm excited about this partnership because the National Center for State Courts, of course, is the only national organization that provides state courts with resources, training, and support. It's the main place state court leaders and court administrators go to get help and information on things that are challenging for courts. So I'm glad they're jumping into this. And the Thomson Reuters Institute is in the business of putting together resources and pulling together information to assist people in the legal profession. So it seems like a promising partnership to explore those four goals. 

I'm actually on this council — we've had one meeting so far — and they have me chairing the access to justice workstream. I'm excited about that and have suggested some other folks I'd love to have join that effort, because I do think if you have creative people collaborating — you hear me say this all the time — across silos, we might be able to produce some pretty interesting new ways of doing business, new products, new services, new ways of thinking about what we do in this disruptive moment with this disruptive, magical technology. Access to justice is one of the areas, as you know, I'm most excited about. 

With Thomson Reuters' resources and the National Center for State Courts' resources, we can move quickly, I hope, to put some things together to help court leaders around the country. It's a space that I hope people follow, and when we have updates I'll make sure to tell you about it.

Jen Leonard: Well, it's an exciting partnership, and it really brings us full circle to the beginning of our conversation — because I'm now freshly concerned about your triathlon sleeping schedule, since you have another role to take on. When are you going to sleep? We have to get you ready for the sprint triathlon!

Bridget McCormack: No, sleep turns out to be really important, so I'm gonna, you know, prioritize it. That's what I have to do — I'm told. That's what Claude told me to do: prioritize.

Jen Leonard: (Laughing) Follow the boss, Claude. Well, this was a great kickoff to our conversations about courts. We'll talk in a future episode about how some judges are experimenting with using generative AI directly in their rulings and opinions, and the various reactions to that. But this was a wonderful overview of where the courts are and the potential opportunities and challenges.