AI vs. Lawyers – Who Said It Better? A Deep Dive into the Future of Law on the 2030 Vision Podcast

Summary
What happens when you put artificial intelligence up against real human lawyers on legal tasks? What if the AI is just as accurate but uses an eighth of the characters?
In episode 21 of the 2030 Vision Podcast, hosts Bridget McCormack and Jen Leonard explore how AI is reshaping the legal industry. The duo tackles the big, messy, thrilling question: What happens to lawyering when AI starts doing it just as well as—or better than—we do?
Key Takeaways
1. The 16,000-Character Lawyer
One standout segment focused on a new benchmarking report by VALS, an independent organization evaluating legal AI tools. One surprising finding? AI is nearly as accurate as lawyers but far more concise.
“The response length for AI was around 2,000–3,000 characters. For lawyers? Nearly 16,000. So, we’re saying the same thing with eight times the words.”
As Jen put it, “It’s really hard to say things with fewer words. Like, it takes a lot more work.” Bridget added, “The only thing I like better than a two-word sentence is a one-word sentence.”
It turns out that brevity really is the soul of wit—and maybe legal efficiency.
2. AI Aha Moments
The episode began, as usual, with “AI Aha” moments—real-world examples demonstrating AI's unexpected power.
One moment featured a law professor who saw her detailed eviction law research replicated by an AI tool in minutes, leaving her stunned. Another moment came from a conversation with a public interest lawyer overwhelmed with communications work who watched AI effortlessly generate press releases, talking points and even a 30-day social media calendar.
Sometimes, the only way to convince people is to show them.
3. The AI Teammate: Better, Faster, Happier?
The podcast also explored a new study from Wharton on how AI impacts teamwork. The results were striking—people who used AI to solve real business problems were more productive, more engaged, and less anxious.
The legal implications are huge. Imagine dissolving traditional boundaries between practice areas. Imagine junior lawyers working across disciplines, supported by AI. Imagine legal teams built around AI copilots instead of conventional structures.
“It might mean the end of the tax attorney, the real estate attorney, the M&A attorney being separate roles on one matter,” said Jen. “Maybe one attorney—with AI—can do it all.”
4. VALS Benchmarking: Are We... Worse Than AI?
The VALS report compared the performance of top legal AI tools—Harvey, CoCounsel, Vincent AI, and Oliver—against human lawyers on tasks like document review, data extraction and redlining.
The takeaway? AI tools often outperformed lawyers. Not in every category—humans still had the edge on nuanced redlining—but AI held its own in most areas. And AI tools are still improving.
“They’re already on par. And they’re only going to get better,” Bridget noted. “We’re not. At least not before I retire.”
5. Let Go of the Ego
One of the podcast’s recurring themes is the legal profession’s deeply held belief in its own perfection. Jason Barnwell of Microsoft captured it best in his reflections on the VALS report:
- Human work isn’t flawless.
- Accuracy alone isn’t the full measure of value—speed and cost matter.
- It’s not either human intelligence or AI—it’s both.
That shift in mindset may be what’s needed most.
6. Law School, Legal Practice... and “What’s the Point?”
Jen and Bridget also revisited the anxiety of law school, where students struggle to grasp meaning from opaque cases and cryptic exam samples. They argued that AI could radically transform legal education.
“Imagine uploading a sample exam to an AI and asking: What’s good about this?” said Jen. “That’s how it should be.”
The technology is already here. Now, it’s about raising awareness and making it accessible.
Final Thoughts
As the episode wrapped, the message was clear: AI isn’t replacing lawyers. It’s augmenting them—making them faster, more effective and maybe even happier in their work.
Whether you’re an AI skeptic or enthusiast, the 2030 Vision Podcast is your space to explore the evolving relationship between law and technology.
Listen in. Learn something new. And maybe try writing that next client memo in fewer than 16,000 characters.
Subscribe to the 2030 Vision Podcast on your favorite platform to stay ahead of the curve in law and technology. The future is here, and it’s more efficient than ever.
Watch Episode 21
Transcript
Below is the full transcript of Episode 21 of The 2030 Vision Podcast.
Jen Leonard: Hi, everyone. Welcome back to another edition of 2030 Vision: AI and the Future of Law. We're here on-site in New York at Legal Week, recording our second of two episodes today—live and in person.
As we do every episode, we have three segments:
- AI Aha! — where we share something surprising or exciting we’ve discovered using AI
- What Just Happened — where we connect recent tech news to its impact on the legal world
- And our Deep Dive — today’s topic is the recent benchmarking reports from VALS, which evaluate the accuracy and performance of different legal GenAI tools.
Let’s kick it off with your AI Aha! for this edition.
AI Aha!: From Eviction Law to Epiphanies—How DeepResearch Changes Minds
Bridget McCormack: My AI Aha! this episode is about introducing AI to someone who had never used it before. Sometimes, the only way to convince someone is to show them in real time.
I was with a dear friend who’s on the faculty at a T14 law school. She had just finished writing an important paper on eviction law, comparing how it differs across states. It was a big project, involving several student researchers, and made a valuable contribution to the field.
As she was describing her work, I had my laptop open—because we’d just been online shopping—and I happened to have DeepResearch up. I typed in the same question she and her students had been investigating. DeepResearch asked me some follow-ups, I asked her those questions, and fed the answers back in.
Twenty minutes later, I showed her the output. She was stunned—literally calling over her partner, amazed at what she was seeing.
It reminded me that showing is far more powerful than telling. For people who feel hesitant or overwhelmed, even just signing up for a subscription is a big barrier. But if they see something applied to work they personally care about, it can change everything.
Jen Leonard: Absolutely. Especially for academics—tools like DeepResearch are like having 20 research assistants at your fingertips.
Bridget McCormack: Once you’ve seen what it can do, it’s hard to imagine starting a new research project without it.
Jen Leonard: My AI Aha! this week is about helping someone else see the power of these tools—specifically, someone working in public service.
As you know, I used to work in city government, and a lot of my friends are still in government and public interest organizations.
That reminded me of a moment on the Ezra Klein Podcast a few weeks ago. Ezra was interviewing one of President Biden’s top AI advisors and asked, “Why aren’t we doing more to address the impact of AI on work and education?” The advisor admitted, “We didn’t have access to Claude.” Ezra’s response was, “Well… that’s kind of damning, isn’t it?”
That line really stayed with me. I remember when I worked in government—so many tools were blocked, firewalled, or unavailable. And yet, the potential for impact in public service is huge. These tools could help overburdened public servants reach more people, respond more quickly, and operate more strategically.
One friend was rolling out a new initiative—he had to write a press release, prep for a press conference, anticipate public objections, and respond to them. He said to me, “I just don’t have time to do all of this, even though I know it would help people.”
So I said, “Let me introduce you to my friend Claude.”
He told Claude what the initiative was, who it would help, and what the goals were. Claude drafted a clear, effective press release—on the first try. It even referenced past city programs and aligned with mayoral priorities, like something a seasoned press officer would write.
Then I said, “Ask Claude to create talking points for your press conference.” It did.
“Ask it to take the position of a cranky reporter who hates your plan.” Done.
“Rewrite it at an eighth-grade reading level.” Done.
“Now create a 30-day social media calendar tailored to different platforms.” Done.
It was amazing watching his face as he saw what was possible. You could almost see the gears turning—realizing how much more he could do, faster, and with less burnout.
We've talked about this before—there's this fear that AI will be unleashed on high-stakes issues involving vulnerable populations. But there are also so many low-stakes, time-intensive tasks that can be offloaded to AI, freeing up human capacity for the work that truly needs it.
That was my AI Aha!—watching someone go from skepticism to belief in real time. And seeing how AI can empower the people doing the most important, often invisible, work.
Bridget McCormack: There was a story making the rounds on Bluesky this weekend—an academic on a plane sat next to a student who was feeding a research assignment into AI, maybe ChatGPT or Claude. The student was generating a draft and then running it through a tool that stripped AI signals so it could pass plagiarism detectors.
She kept editing and checking until it passed the AI fraud screen. The professor was critical: “Can you believe she spent all that time trying to cheat?” But I thought—maybe she’s just preparing for the jobs of the future. She’s learning to work with the technology, not just around it.
And someone should tell her—she could automate that whole process and save even more time. Honestly, she's just preparing for the future. She’ll be fine.
Jen Leonard: Did you catch that Hard Fork episode about the Columbia student? They interviewed a sophomore applying for tech jobs—Amazon, Google, Meta. In that industry, applicants take a live skills assessment while being monitored on camera.
He developed a tool that uses generative AI to solve the problems and subtly show him the answers without moving his eyes—so the proctor wouldn’t notice. Then he used it himself, and started selling it to other students. Once the companies found out, his job offers were rescinded. Soon after, though, he was flooded with new offers.
Bridget McCormack: Right? Because of course—he just exposed how broken the test is. His pitch was basically: "Stupid tests deserve stupid shortcuts." It’s not a meaningful way to assess talent anymore.
Jen Leonard: Exactly. I couldn't help but think of similar outdated assessments in our own industry. He even said his company made $200,000 last month and is on track for $2 million. Casey Newton joked, “That’s about a year of Columbia tuition.”
And I don’t know if it’s fact, but it’s definitely true—most professors have no idea how much of this is happening. Until recently, we all had things we said were stupid but still had to do. Now there’s tech that actually lets students sidestep them entirely—and some are being very open about it.
Bridget McCormack: Those are the students who are going to run the companies in five years.
Jen Leonard: No doubt. I loved that story—and thanks for sharing the plane example.
What Just Happened: Ethan Mollick’s Cybernetic Teammate and the Future of Legal Teams
Jen Leonard: Our What Just Happened segment connects developments in the broader tech landscape to what they might mean for the legal profession. And of course, we’re talking again about Ethan Mollick—he’s just prolific, thoughtful, and always worth following. He’s a professor at Wharton, and if you’re not following him on LinkedIn, you should.
He and his colleagues recently published a research paper called The Cybernetic Teammate: A Field Experiment on Generative AI Reshaping Teamwork and Expertise. The research was designed to test how generative AI impacts teamwork across different functions within an organization.
Bridget McCormack: Yeah, it’s super interesting. You probably read it before I did—I think I texted you about it over the weekend, like I usually do when I catch up on reading. Ethan wrote about it in his Substack, and I was blown away. I could see so many direct applications for our teams at the AAA. And, of course, we already built a version of this experiment ourselves.
The researchers worked with two teams at Procter & Gamble—one R&D team and one business development team—on real product challenges. Some participants worked alone, others in two-person teams, and some were paired with GPT-4. They collected data across all those conditions.
The teams were cross-functional, combining R&D with commercial folks. They measured three outcomes: performance, whether they could bridge knowledge across specialties (what they called expertise sharing), and the emotional experience of the work.
Not surprisingly, AI made a difference on all three metrics.
For performance, individuals paired with AI performed as well as—or better than—teams without AI. So if I’m working with generative AI, I can perform at the level of a small team. AI also boosted productivity and innovation—something Ethan has shown in previous research.
On expertise sharing, the AI helped bridge knowledge silos. It balanced out missing expertise on teams and acted like a shortcut. And then there’s the emotional effect—this part really stood out to me. Participants using AI reported more positive emotions and fewer negative ones. They were more engaged, less anxious, and less frustrated than those working without AI.
Jen Leonard: That tracks with my experience too—especially the reduction in frustration. It’s not even about being anxious or angry. Sometimes coordinating with people just takes time. It’s magical to work with AI and get straight to the point.
It also echoes that University of Minnesota study we talked about—law students using AI reported enjoying the work more. And I get it! I would have loved to outsource some of my law school work to an AI. But isn’t that the point of law school? That it’s supposed to be hard?
Bridget McCormack: But is that the point? I used to be confident about that, and now I’m not so sure. Maybe that’s a new segment we need: What Is the Point?
Jen Leonard: Like the Columbia student we talked about in the last segment, I took the cybernetic teammate study and fed it into Claude and ChatGPT, asking: What are the implications of this for legal professionals—especially junior lawyers?
They gave thoughtful responses. One of the biggest takeaways was how this technology could lead to leaps in capability. If a first- or second-year associate can now do the work of an eighth- or ninth-year, what does that mean for development?
What really caught my attention, though, was the insight around cross-functionality. In the study, P&G’s commercial and R&D teams each gained new understanding of the other's domain, thanks to AI. The AI helped transfer expertise across those silos.
Both Claude and ChatGPT pointed out that in a legal context, this could dissolve practice area boundaries. You wouldn’t just need a tax attorney, a real estate attorney, and an M&A attorney on a matter. One lawyer, aided by AI, might now be able to cover all those areas.
Bridget McCormack: That’s such a “back to the future” moment. The legal profession moved toward specialization—everyone became an expert in one thing. You had the tax lawyer, the regulatory lawyer, the litigator.
That made sense when the law was opaque and high-risk decisions required deep expertise. But maybe the generalist lawyer—the one-stop-shop down the street—is coming back. And that could be good news! Better for quality of life, better for access to justice, and more sustainable for small-town or rural practices.
Jen Leonard: Maybe we are going back to the future. And maybe it’s better for our happiness, too. Research from Marty Seligman at Penn and David Epstein, who wrote Range, shows humans are naturally wired to be generalists—not hyper-specialists.
Plus, if AI reduces friction—because you don’t need to find a subject matter expert for every question—it might make teams more agile. That could be why specializations emerged: to reduce friction. But if AI can solve for that, maybe one lawyer can do more. And that’s powerful for solo and small practices looking to expand their offerings.
And maybe it pushes back on what Bill Henderson’s talked about for decades: the shift from “people law” to “corporate law.” AI might help us rebalance—and AI also raised another question when I prompted it: how does this change team structure and lawyer development in firms? Will we see associates working across disciplines, with partners and AI as part of the team?
Bridget McCormack: It opens up a huge opportunity to rethink how we structure firms and train lawyers. It’s an exciting time to lead a legal business.
Jen Leonard: There’s also a huge opportunity in professional development. That handoff from law school to practice has always been clunky, and legal PD hasn’t seen much innovation. But now? It’s a totally different game.
Bridget McCormack: And unlike law schools, PD programs aren’t constrained in the same way. There’s so much room for experimentation.
Jen Leonard: We’ve got to get Sharon on the phone—wherever she is, probably in a Jeep somewhere. But yes, the study’s emotional effects stood out to me too. The AI clearly ingested a lot about the anxiety junior lawyers feel.
Bridget McCormack: I saw myself in that research too. First year of law school, I was totally lost. No lawyers in my family, no context for what I was learning—I didn’t even know why we were reading the cases we were reading. Everyone else seemed to get it. I was just confused.
Jen Leonard: Same. I was a mess. I didn’t even go to study groups because I was too embarrassed. Everyone seemed to know more than me. Then I’d try to look at a sample exam in the library and just feel worse. It didn’t help at all.
But now? I’d upload that sample to Claude and ask, “What’s good about this? Walk me through it.” That’s what I needed. Just some background, some scaffolding—especially for first-gen students or anyone new to legal culture.
Bridget McCormack: Exactly. I think I would’ve enjoyed law school more with those supports. Maybe not loved it—but definitely less anxiety. Law firms were scarier though. You go from that mess of law school into a high-stakes practice, where partners are juggling client demands and barely have time to mentor you.
Jen Leonard: Right, and those partners aren’t trained to mentor. They’re trying to teach you on top of billable hours in a system that never taught them how to train anyone.
Bridget McCormack: That’s why this study stood out. If we can reduce anxiety and increase engagement, we can help junior lawyers unlock their own potential. They don’t need huge changes—just access to these tools. And they’re probably already aware, especially if they’re listening to the Columbia kid!
Jen Leonard: And it’s exciting to see how much of this research is reinforcing the same themes: AI elevates performance, reduces anxiety, and creates new possibilities for legal education, development, and the business of law.
I’m curious to hear more from your team as you keep testing this. Please report back!
Bridget McCormack: It’s a powerful shortcut. I definitely will. I’m excited.
Scaling Legal Innovation: Inside the VALS Legal AI Benchmarking Report
Jen Leonard: Our big topic today is a new report released last month by VALS—an independent organization that benchmarks the performance of legal tech tools. This is one of the first times we’ve seen a truly independent evaluation of the platforms everyone’s been buzzing about in legal tech.
Bridget McCormack: It’s the first real benchmarking study in the legal space, which is interesting because Ethan Mollick and his colleagues have already been doing this in business and medicine. Since the release of ChatGPT, companies and healthcare providers have welcomed this kind of research. Doctors are publishing results—even when the AI outperforms them.
But in law, it took us this long to even start. Maybe it’s because we think we’re special.
Jen Leonard: Right? “We can’t benchmark—our work is privileged!”
Bridget McCormack: Or, “Everything in law is bespoke!” Which always makes me laugh—because if everything is bespoke, what are we even teaching in law schools?
Jen Leonard: So this study from VALS is the first time lawyers have seen a benchmarking report on legal AI tools—Harvey’s Assistant, Thomson Reuters' CoCounsel, Vincent AI from vLex, and Oliver from VECFLOW. Lexis+ AI was originally included but withdrew before the tasks began.
Researchers partnered with firms like Reed Smith, Fisher Phillips, McDermott Will & Emery, and Ogletree Deakins to define tasks for the AI tools, including data extraction, document Q&A, summarization, redlining, transcript analysis, chronology generation, and EDGAR research.
Each tool’s output was measured for accuracy, and VALS partnered with Cognia Law—an ALSP—to source human lawyers who also completed these tasks. Their performance served as a baseline.
Bridget McCormack: That baseline is what makes the study meaningful. Without it, we’re just comparing AI tools to each other. But the human benchmark shows us what we’re really working with—and what’s possible.
Jen Leonard: Exactly. So, how did the tools perform?
Bridget McCormack: Overall? Pretty well. In most categories, the AI tools performed slightly better than the human lawyers. That didn’t surprise me. I use these tools regularly, and I’m not shocked that they outperform people in some areas.
Humans still did better in certain tasks—like redlining and contract review—probably because those require more judgment about what to focus on. But even there, it feels like the tools could catch up quickly with training. Unlike humans, they don’t get tired or distracted, and they don’t bring the same cognitive biases to the task.
That’s the big takeaway for me: as Andy Perlman often says, we need to ask, “Compared to what?” The tools are already performing at least on par with lawyers—and often better. And it’s only been three years.
Jen Leonard: Right. Sure, we can dig into details—Harvey was 75.1% accurate at data extraction, CoCounsel was 73.2%, etc. But the real headline is: Lawyers are not perfect. That seems to be a surprising revelation for some people.
It’s funny how we’ve assumed perfection in human legal work for so long. These tools are closing the gap fast—and in some cases, they’ve already surpassed us.
Bridget McCormack: There were a few limitations noted. Some tools struggled with complex legal judgment. Hallucinations still happened, and performance varied based on document type. The AIs were strongest with structured, text-based tasks—not open-ended research. But creativity? AI already shines there.
Jen Leonard: Yes—and the commentary afterward was great. Jason Barnwell from Microsoft shared one of my favorite takes:
“Flawed premises we’re too comfortable with include assuming human work is flawless; assuming accuracy need not be weighed against time and cost; and assuming our choices are binary—human or machine intelligence.”
He’s right. A client may prefer 95% accuracy in five minutes over 100% in 20 hours. And we’re already working in hybrid mode, combining AI and human intelligence whether we admit it or not.
That’s the thing—I wonder how long we’ll even bother benchmarking at all. It matters now, especially for organizations like AAA that need to show transparency and accountability. But in five years? It might feel irrelevant.
Bridget McCormack: Totally. I was speaking with an investor here at Legal Week who said when they’re selling a business, they don’t care if a dispute is resolved perfectly—they just want it done. “Tell me what we owe so we can move forward.” And that hit me: maybe accuracy isn’t always the point.
Jen Leonard: That’s another one for our What Is the Point? segment. I remember a lawyer once bragging that clients hire him for his “edge” in writing. I thought, your clients don’t want an edge—they want the issue solved. They want to move on. Lawyers often see themselves as the centerpiece when sometimes, we’re just the bottleneck.
Bridget McCormack: But while we’re in this early phase—where the AI is still shaped like a “horseless carriage”—benchmarking helps build confidence. Once the tech is fully integrated and normalized, the need to compare might fade.
Jen Leonard: Also, my favorite part of the study was that chart comparing accuracy and response length in a data extraction task. AI responses averaged 2,000 to 3,000 characters. Lawyers? Nearly 16,000. We say the same thing—but in eight times as many words.
Bridget McCormack: Right? It takes real skill to be concise. The only thing I love more than a two-word sentence is a one-word sentence.
Jen Leonard: This study was a great reminder: the tools are improving fast—but the humans? Not so much. And maybe that’s the real insight here. It tells us as much about ourselves as it does about AI.
You’re the one who always says: for a profession obsessed with evidence and proof, we don’t study what we do—or whether it works. We don’t even measure our own effectiveness.
Bridget McCormack: We really don’t. But we should.
Jen Leonard: Thanks for tuning in to 2030 Vision: AI and the Future of Law. If you enjoyed this episode, please subscribe and leave a review. Stay informed and inspired as we continue to explore the AI-driven future of the legal profession. See you next time.