Vibe coding is now part of our vocabulary. Andrej Karpathy's tongue-in-cheek term for AI-assisted experimentation has escaped the lab and become mainstream shorthand for "just ask AI and ship it".

The demos look amazing. "I built this tool in one afternoon with [Claude, Lovable, Replit]." And the obvious follow-up: why can't we just build everything this way?

Because what Karpathy described as a mindset for experimental weekend projects has morphed into a perceived shortcut for professional software development. That conflation is dangerous — and understanding why matters for anyone building legal tools with AI assistance.

What "Vibe Coding" Actually Means (And Why Karpathy Named It That)

The term comes from Andrej Karpathy's post on X about how he uses AI coding assistants:

Andrej Karpathy Vibe Coding Tweet

His definition is specific and intentional:

"you fully give in to the vibes. [...] and forget that the code even exists."

This isn't a workflow. It's a mindset for low-stakes experimentation. You describe what you want. The AI writes code. You run it. If it works, great. If it doesn't, you try again. You're not reviewing the code line by line. You're not writing tests. You're not planning architecture. You're vibing.

Karpathy coined this term for weekend projects. One-shot experiments. Throwaway prototypes. It's for when the result matters more than the process; when you want to validate an idea quickly, test a hypothesis cheaply, or communicate a concept with a working demo instead of a slide deck.

And in that context, it's perfect.

A lawyer with a workflow problem can vibe code a quick demo in an afternoon. Show it to colleagues: "Here's what I'm thinking, this tool could automate the tedious parts of our client intake process." Use it to communicate ideas that would take hours to explain in prose.

The value isn't the code. It's the conversation the prototype enables. Can we solve this problem with automation? Is this approach worth investing in? Does this actually address our workflow pain points?

This is tool-assisted sketching, not software engineering. And that's fine! Sketches are valuable. But you don't show sketches to clients and call them tools.

Why Vibe Coding Breaks Down for Production Legal Tools

So why can't we just vibe code production tools if AI can generate working code in minutes?

Because production software has requirements that prototypes don't. And AI coding assistants — brilliant as they are — don't automatically address those requirements unless someone with engineering judgment is steering them.

Security and Authentication: The Tea-App Warning

Last year, a startup called Tea suffered a massive data breach, exposing users' private messages, location data, and personal information. The founder, who had approximately six months of programming experience, had built the MVP with AI coding assistants, and the authentication implementation had catastrophic flaws that someone with security expertise would have caught immediately.

For legal tools handling client data, confidential communications, or sensitive business information, this isn't just a bug. It's a career-limiting mistake that no professional insurance policy will cover.

Maintainability: The Technical Debt Problem

AI coding agents generate code that works today. Without engineering discipline, that code becomes unmaintainable tomorrow. Functions are named inconsistently. Logic is duplicated across files. There's no clear architecture. Edge cases aren't handled. Error messages are vague or missing entirely.

The consequence: When you need to add features, fix bugs, or integrate with other systems, you're building on quicksand. What took 2 hours to vibe code takes 20 hours to refactor properly later — if you can even figure out what the original code was trying to do.

Domain expertise helps you spot legal errors in AI output. Engineering expertise helps you spot technical ones. Both matter. Vibe coding assumes you only need the first.

Scale and Edge Cases: When Demos Meet Reality

Your vibe-coded prototype processes 10 test documents beautifully. Then you run it on 1,000 real documents and it crashes, produces garbage output, or takes 6 hours to finish.

Or worse: it appears to have worked, but silently skipped files with unexpected formatting, mishandled Unicode characters in entity names, or produced subtly wrong results that you won't catch right away.

Without proper logging, observability, and error handling, you're flying blind. You don't know why it failed. You can't fix it systematically. You're back to trial-and-error debugging at scale, with production data.
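To make that concrete, here's a minimal sketch of the difference between flying blind and having a record. It uses only Python's standard-library logging; the function name and batch shape are illustrative, not taken from any particular tool:

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("batch")


def process_documents(paths, parse):
    """Process a batch of documents, recording every failure
    instead of silently dropping the file."""
    results, failures = [], []
    for path in paths:
        try:
            results.append(parse(path))
        except Exception as exc:
            # A failed file is logged with its name and the reason,
            # so "it skipped something" is visible and debuggable.
            log.warning("skipped %s: %s", path, exc)
            failures.append((path, str(exc)))
    log.info(
        "processed %d of %d documents (%d failed)",
        len(results), len(paths), len(failures),
    )
    return results, failures
```

With this shape, a run over 1,000 real documents tells you exactly which files failed and why, instead of leaving you to diff the output counts by hand.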

Testing: The Non-Negotiable Guardrail

Professional development requires automated testing. Tests define expected behavior. Tests catch regressions when you change code. Tests validate that edge cases are handled correctly.

Vibe coding usually skips this entirely. "It works when I run it" becomes the test suite, which is fine when it's a throwaway experiment, but dangerous when colleagues start depending on it.

Here's the thing: with AI coding agents, there's no excuse not to write tests. The AI can generate test scaffolding instantly. It can write basic test cases. It can even suggest edge cases you might have missed.

But someone needs to review those tests. Someone needs to add coverage for domain-specific scenarios. Someone needs to ensure tests actually validate the right things, not just "does this function return something?" but "does this correctly handle malformed dates in tax forms?"
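As an illustration of that last point, here's a sketch of what domain-specific test coverage looks like. The `parse_filing_date` helper, its accepted formats, and the tests are all invented for this example; the point is that the tests pin down domain behavior (malformed dates must fail loudly), not just "returns something":

```python
from datetime import date, datetime


def parse_filing_date(raw: str) -> date:
    """Hypothetical helper: accept the date formats our tax forms
    actually use, and fail loudly on anything else instead of guessing."""
    for fmt in ("%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"unrecognized date format: {raw!r}")


def test_known_formats_parse():
    assert parse_filing_date("2024-03-31") == date(2024, 3, 31)
    assert parse_filing_date("31.03.2024") == date(2024, 3, 31)


def test_malformed_dates_are_rejected():
    # The domain-specific check: a nonsense month must raise an error,
    # not silently produce a plausible-looking wrong date.
    try:
        parse_filing_date("2024-13-01")
    except ValueError:
        return
    raise AssertionError("malformed date was accepted")
```

A test suite like this is exactly what catches the AI when it later "simplifies" the parser and quietly breaks one of the formats.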

I can't count how many times tests have saved me from AI-generated bugs. The AI modifies a function. The tests fail. I catch the problem before it reaches production. Or worse: the AI modifies existing tests to make buggy code pass — and I catch it during code review because I actually read what the AI changed.

The Shift: From Coding to Orchestration

Here's what changes when you move from vibe coding to professional development with AI assistance:

AI handles syntax, boilerplate, and repetitive implementation. You handle architecture, domain logic, edge case identification, and quality validation.

The work shifts from manual coding to what I think of as a very weird form of management. You're managing an extremely capable but occasionally overconfident developer who needs clear direction, produces work at incredible speed, and requires careful review.

This is where domain expertise becomes your multiplier.

A Workflow That Actually Works

Here's how I use AI in my development workflow. When I spot something AI-generated that I don't fully understand, I ask the AI to explain it — which makes it not just a coding tool but a learning resource. For deeper architectural questions, I'll use research tools like Perplexity to understand tradeoffs before committing to an approach.

1. Design before code

I explain the feature I want to build. If I have architectural ideas - API design, database schema, data flow - I specify those upfront. The AI doesn't guess the structure. I define it.

Sometimes I don't have a clear design in mind, though. In those cases, I use AI as a brainstorming partner: "Here's the problem I'm trying to solve. What are some architectural approaches I should consider?" The AI might suggest patterns I haven't encountered before, and that's when I'll research them to understand the implications.

This is where legal domain knowledge matters most. What data do we need to capture? What edge cases exist in our jurisdiction? What validation rules apply?

2. Implementation planning

If I'm working with an existing codebase, I ask the AI to analyze it and draft an implementation plan. Not "write the code", but "here's what needs to change, here's where new code fits, here's what could break."

For greenfield projects, we skip the code analysis and focus on defining the structure: what components are needed, how they interact, what the data flow should be.

We iterate on the plan before writing code. Does this approach handle edge cases? Does it integrate cleanly with existing systems? Is there a simpler design? Would this scale if we 10x the data volume?

This is exactly what partners (should) do in a law firm: review a junior associate's approach before they spend days executing it.

3. Test-driven guardrails

I define tests first, or immediately after seeing the AI's first implementation. Tests are the boundaries within which the AI operates.

The AI generates tests. I review the tests meticulously. I add tests for edge cases the AI missed. I ensure tests validate domain logic, not just technical functionality.

4. Code review and validation

When the AI generates implementation code, modern coding agents often review their own work first - running tests, checking for obvious issues. But I still review the output myself. Not line-by-line syntax checking (the AI handles that), but:

  • Does this match the agreed design?
  • Are there obvious performance issues?
  • Are error cases handled properly?
  • Would I be comfortable explaining this code to another developer?
  • Can I articulate why this approach was chosen over alternatives?

If something doesn't make sense, I ask the AI to explain its reasoning. Sometimes that reveals a clever optimization I wouldn't have thought of. Sometimes it reveals a misunderstanding that needs correction.

The Skill Shift: Syntax vs. Systems

You might not need to memorize syntax anymore. dict.get() vs dict[]? Ask the AI. Pandas DataFrame methods? The AI knows them all.
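For the record, the dict.get() vs dict[] difference is a good example of syntax you can safely delegate, because the behavior is well defined:

```python
record = {"client": "Acme LLC"}

# dict[] raises KeyError when the key is missing --
# the right choice when a missing key indicates a bug.
missing = False
try:
    record["matter_id"]
except KeyError:
    missing = True

# dict.get() returns a default instead --
# the right choice when absence is an expected case.
matter = record.get("matter_id", "unassigned")
```

Knowing *which* behavior your system needs in a given spot is the part you can't delegate.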

But you absolutely need to understand:

  • System design: How do components interact? What happens when things fail? Where are the bottlenecks?
  • Common patterns: What's a reasonable way to structure this? What patterns lead to maintainability nightmares?
  • Security best practices: What are common vulnerabilities? How should authentication work? What data needs encryption?
  • Performance tradeoffs: Will this approach scale? What's the cost of this design decision? When does "good enough" become "actively harmful at scale"?

What I've described here is not vibe coding. This is something else entirely: professional software development adapted for an AI-assisted world. Several people have proposed names for this workflow ("AI-assisted coding", "vibe engineering"), but one term appears to have stuck: "agentic engineering." Who coined it? Andrej Karpathy, of course!

A year after coining "vibe coding," Karpathy acknowledged that the landscape had shifted. Programming via LLM agents is becoming the default workflow for professionals, but crucially, he noted this happens "with more oversight and scrutiny." He proposed "agentic engineering" to capture this professional approach.

"Agentic" because you're not writing code directly 99% of the time; you're orchestrating agents that do the work while you provide oversight.

"Engineering" to emphasize that there is art, science, and expertise to it. It's something you can learn and become better at, with its own kind of depth:

Andrej Karpathy on Agentic Engineering

"But AI Wrote It" Isn't a Defense

The same way agentic engineering must be guided by people who understand engineering principles, legal AI must be guided by people who understand law.

Think about it: You wouldn't let a legally untrained person draft high-stakes client contracts with ChatGPT or Claude, even if the output appeared polished. Sophisticated language. Proper formatting. Standard-looking clauses. But they lack the domain expertise to evaluate the output. They can't spot when the AI hallucinates a legal standard. They can't identify missing liability protections. They can't assess whether the language achieves the client's goals. They can't evaluate whether terms are enforceable in the relevant jurisdiction.

The same principle applies to software handling legal workflows.

A non-engineer can vibe code an app that functions in demos. But they can't evaluate whether it's secure, maintainable, or scalable. They can't spot architectural decisions that will cause problems later. They can't assess the quality of code they're responsible for deploying.

"But AI wrote it" isn't a defense when the tool produces subtly wrong results that affect legal advice. If you (aim to) deploy software professionally - handling client data, automating legal workflows, integrating with firm systems - you have a responsibility to ensure it meets professional standards.

The Opportunity (And the Warning)

AI coding agents genuinely lower the barrier to building custom legal tools. But "lower barrier" doesn't mean "no barrier."

Law firms that understand the difference between prototyping and production will thrive: empower lawyers to explore automation through vibe coding, then channel promising ideas into professionally built tools. The two modes aren't in conflict — they're a pipeline.

The misconception isn't that vibe coding exists. It's that vibe coding is appropriate for professional software development. Karpathy was explicit: it's a mindset for low-stakes experiments, not a development methodology for production systems.

Legal practice demands accountability. Accountable software requires more than vibes. It requires engineering.