Claude, Claude Code, and Why AI Is Not Going Anywhere
Let me start with the thing I believe most firmly as a software engineer in 2026: AI is not a trend. It is not the blockchain. It is not a bubble waiting to pop. I have been building with it long enough now to say with confidence that it has permanently changed how software gets made, and engineers who treat it as temporary will find themselves in an uncomfortable place in the next few years.
That is not a prediction borrowed from a tech newsletter. It is what I see in my own work, in the things I can ship now that would have taken me twice as long before, in the bugs I catch earlier, in the architectures I think through more carefully because I have a thinking partner that can keep up.
This post is about Claude specifically, both the web version and Claude Code, what I actually use them for, and an honest accounting of what works and what still frustrates me.
Claude vs Claude Code: Two Different Tools
People conflate these constantly and it causes real confusion. Let me be specific.
Claude (claude.ai) is the web interface. Conversations, documents, reasoning, drafting, brainstorming. It supports Projects, which are persistent workspaces with file attachments and custom instructions. This is where I go when I need to think, not just when I need to type.
Claude Code is the terminal agent. It reads your codebase, runs commands, writes and edits files, and operates autonomously on tasks you give it. It is not a chat interface. It is closer to a junior engineer who can execute in your environment without constant supervision.
The mistake I see most often is people using Claude Code like a chat window, asking it questions instead of giving it tasks, and then concluding it is not that useful. Or using the claude.ai web interface to try to write code across a codebase it cannot actually see. They are different tools for different jobs.
My rough split: claude.ai for understanding, planning, documentation, and reasoning about trade-offs. Claude Code for implementation, refactoring, debugging with real test runs, and anything that requires touching actual files.
AI Is Here. It Is Not Leaving.
I want to say this plainly because I am tired of the hedged takes.
When I look at the trajectory from 2023 to now, I do not see a hype cycle winding down. I see models getting meaningfully better every few months in ways that show up directly in my work. The gap between what Claude could do when I first started using it and what it can do now is not marginal. It is substantial.
More importantly, I see the work itself changing. Tasks that used to consume entire afternoons (adding validation across a directory of files, auditing API routes for missing auth, refactoring a naming convention across a large codebase) can now be delegated. The overhead of setup and verification is much smaller than the time it saves.
That is an irreversible shift. Once you have built with that kind of leverage, you do not go back to doing everything manually. The muscle memory changes. The way you estimate tasks changes. What you consider reasonable scope for a single day changes.
AI is here to stay not because of hype but because the productivity differential is real and it compounds. Engineers who build the habit of working alongside these tools will consistently do more in the same time, and that gap will not close by ignoring the tools.
The Pros: What AI Actually Gets Right
These are real, experienced first-hand, not borrowed from a product launch announcement:
Speed on well-defined tasks. When I know what I want and I can describe it clearly, Claude Code executes it faster than I would type it myself. Adding a new field to every database model and updating the associated validation and tests used to take an hour of careful work. Now it takes fifteen minutes, most of which is carefully reviewing what Claude Code did.
Better code review, weirdly. Before I open a pull request, I have Claude read the diff and poke holes in it. It catches things I missed: naming inconsistencies, edge cases in conditionals, places where I changed behavior without realizing it. It is not a replacement for a human reviewer, but it has consistently improved what I submit for review.
Architecture thinking partner. When I am working out how to structure something and I am not sure which approach is right, talking it through with Claude in a Project conversation accelerates my thinking. Not because it hands me the answer, but because articulating trade-offs to something that asks follow-up questions forces clearer reasoning.
Documentation that actually gets written. I am honest about the fact that I used to defer documentation constantly. Now I have Claude write the first draft from context while I handle the final pass for accuracy. It is not perfect but the first draft is done, and done is better than never started.
Catching errors earlier. Running extended thinking on a tricky architectural decision has, more than once, surfaced a problem I would have hit during implementation. The upfront cost of a longer response is usually worth it on the decisions that matter.
Learning new territory faster. When I am working with a library or pattern I have not used before, having Claude explain it in the context of my actual codebase, not in some generic example, cuts the ramp-up time significantly.
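The pre-PR review habit above is easy to script. A minimal sketch, assuming Claude Code is installed as `claude` and supports non-interactive print mode via the `-p` flag; adjust the base branch to match your repository:

```shell
# Pipe the branch diff into Claude Code in print mode and ask for
# a focused review. Assumes `claude -p` (non-interactive output).
git diff main...HEAD | claude -p "Review this diff. Flag naming \
inconsistencies, edge cases in conditionals, and any behavior \
changes that look unintentional."
```

The narrower the review instruction, the more useful the result; "review this" produces generalities, a list of specific failure modes produces findings.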
The Cons: Because This Is Not a Press Release
It drifts on long tasks. Give Claude Code a task that spans thirty or more files and it will start losing track of what it already changed. I have watched it try to create a file it created five minutes earlier, or apply a migration pattern to something it already migrated. Large tasks need to be staged deliberately, not handed over as a single instruction.
It does not know why the code is the way it is. Claude Code reads what exists. It does not know that a specific pattern is there because of a production incident six months ago, or that a function behaves unusually because of a platform limitation that is not documented anywhere. It will confidently simplify things that exist for a reason. Code review on its autonomous output is not optional.
Confidence is not correlated with correctness. The model sounds equally confident when it is right and when it is subtly wrong. The wrong answers do not come with disclaimers. You cannot use the fluency of the response as a signal of accuracy. You have to verify.
Context windows have limits. On very large codebases, Claude Code does not have the whole picture at once. It works from what it can see, and sometimes what it can see is not enough. I have had it make changes based on an incomplete view of the codebase that introduced problems it could not have anticipated.
Vague instructions produce mediocre output. When the task is complex and the instruction is loose, the code it produces can be technically functional but architecturally poor. Getting good results requires writing better instructions, which is itself a skill that takes time to develop.
It can generate plausible-looking nonsense. Especially with third-party APIs and newer libraries, Claude sometimes invents function signatures or configuration options that do not exist. If you are not verifying against real documentation, this will bite you.
Claude Code: Agent Mode in Practice
The thing that separates Claude Code from a code autocomplete tool is the agentic loop. It does not wait to be asked about each step. It acts, reads its own output, and adjusts based on what it finds.
Here is a concrete example. I asked it to add Zod validation to every Mastra tool in a directory with about twelve files:
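Paraphrased, since I do not have the exact transcript (the directory path is illustrative):

```text
In src/mastra/tools/, every tool currently accepts untyped input.
Add a Zod input schema for each tool, wire it into the tool
definition, and run tsc and the linter when you are done.
```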
It read the directory, understood the structure of each tool, wrote the Zod schemas, ran tsc to check if the types matched, found two that did not, fixed them, and then ran the linter. I watched it catch its own type error and fix it in the same session without me pointing anything out.
That loop (act, check the output, adjust) is what I mean by agentic. It is closer to delegating a task and reviewing the pull request afterward than it is to using a smarter autocomplete.
Sub-Agents: Parallel Investigation
Claude Code can spawn sub-agents to work through different parts of a task at the same time. This is not something most write-ups cover, but it is visible during large tasks if you watch closely.
When I asked it to audit every API route in the Kemuko codebase for missing authentication checks, it did not read files sequentially. It split the work across multiple investigation threads before synthesizing the findings. The audit was faster and more complete than a sequential read would have produced.
I now use this intentionally for tasks where the investigation phase is the bottleneck:
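A typical example, paraphrased:

```text
Find every place in this codebase where we handle rate limiting.
Report where the logic lives, whether the approaches are
consistent, and which endpoints have no rate limiting at all.
```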
That kind of question touches many files with no obvious dependency order. Parallel investigation suits it well, and the output is noticeably more thorough.
CLAUDE.md: Project-Level Memory That Actually Works
Every project I work on seriously now has a CLAUDE.md at the root. It is a plain markdown file that Claude Code reads automatically before every task. Think of it as the onboarding document for an AI collaborator that starts fresh every session.
For this portfolio site the file tells it to use pnpm, what the folder structure means, and what commands to run before committing. Small things. Without them, Claude Code will confidently do the wrong thing: suggest npm install, create files in the wrong directory, skip linting. With them, it follows project conventions without being reminded.
On Kemuko the CLAUDE.md is longer. It documents which routes are authenticated, where the Mastra tools live, what error handling patterns are standard, and what I explicitly do not want it touching without a second look.
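A trimmed-down sketch of what such a file looks like. The specifics below are illustrative, not Kemuko's actual file:

```markdown
# CLAUDE.md

## Tooling
- Use pnpm, never npm or yarn.
- Run `pnpm lint` and `pnpm test` before committing.

## Structure
- Mastra tools live in src/mastra/tools/.
- API routes live in src/app/api/; every route except /api/health
  requires an auth check.

## Do not touch without asking
- The retry logic in src/lib/queue.ts exists because of a
  production incident. Do not simplify it.
```

The "do not touch" section earns its keep. It is the cheapest way to encode the tribal knowledge that otherwise lives only in your head.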
MCP: Giving Claude Code External Context
Model Context Protocol lets Claude Code use tools outside the repository. The GitHub MCP server is the one I use most.
Once configured, Claude Code can read GitHub issues and pull request comments while working in the codebase:
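A typical request once the server is wired up (the issue number is hypothetical):

```text
Issue #214 reports intermittent 401s on the webhook endpoint.
Read the issue thread, find the relevant code, and propose a fix
that accounts for what was already tried in the comments.
```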
It reads the issue thread, finds the relevant files, and responds with something grounded in the full context, not just the code state but what people observed in production, what workarounds were tried, what the actual error messages said.
Setting it up is a .mcp.json at the project root:
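A minimal sketch for the GitHub server. Check the server's own documentation for the current package name and required environment variables, as these change:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```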
There are MCP servers for Postgres, filesystem access, Slack, Linear, and more. I keep most integrations read-only. The read access alone meaningfully expands what is worth delegating to Claude Code.
Claude.ai Projects: Persistent Context Across Sessions
Claude Code resets between sessions. Claude.ai Projects does not.
A Project in the claude.ai web interface is a persistent workspace where uploaded files and custom instructions are retained across every conversation within it. I maintain a project for Kemuko that holds the database schema and the Mastra agent instruction files. It also has a running document of architectural decisions, the "we tried X and it did not work because Y" knowledge that does not live anywhere in the code.
When I need to think through how a new feature interacts with existing patterns, I go there first. The context is already established. I am not re-explaining the architecture from scratch, I am continuing a conversation that already has the foundation.
The contextless version feels like every conversation starts at zero. The project version feels like talking to someone who actually knows the codebase. The difference is hard to appreciate until you have experienced both.
Extended Thinking: When It Is Worth the Wait
Extended thinking lets Claude reason at length before responding. For most things it is overkill and just adds wait time. But there are categories where the output is qualitatively different:
Non-obvious architectural trade-offs. When I was working out how the Kemuko support agent should handle concurrent requests on the same ticket, the extended-thinking response caught race conditions I had missed and suggested optimistic locking grounded specifically in how the existing tools were structured. The standard response gave me a generic answer.
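Optimistic locking itself is simple to sketch. What follows is a generic in-memory illustration of the idea, not Kemuko's implementation: each ticket carries a version number, and a write only succeeds if the version it read is still current.

```typescript
// Generic optimistic-locking sketch. A writer must present the
// version it originally read; if another writer committed first,
// the update fails and the caller re-reads instead of clobbering.
interface Ticket {
  id: string;
  status: string;
  version: number;
}

class TicketStore {
  private tickets = new Map<string, Ticket>();

  get(id: string): Ticket | undefined {
    const t = this.tickets.get(id);
    return t ? { ...t } : undefined;
  }

  put(t: Ticket): void {
    this.tickets.set(t.id, { ...t });
  }

  // Returns true if the update won the race, false if the ticket
  // changed since `expectedVersion` was read.
  updateStatus(id: string, status: string, expectedVersion: number): boolean {
    const current = this.tickets.get(id);
    if (!current || current.version !== expectedVersion) return false;
    this.tickets.set(id, { ...current, status, version: current.version + 1 });
    return true;
  }
}
```

Two concurrent agents that both read version 1 cannot both write: the second updateStatus call sees version 2, fails, and is forced to re-read rather than silently overwrite the first agent's change.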
Debugging where the cause is not obvious from the stack trace. When the error is deep in a third-party library or involves timing, extended thinking follows the causal chain further before offering a guess.
For anything where the answer is mostly "write this code," skip it. You will wait longer for the same result.
Best Practices I Have Actually Settled On
These are not rules I read somewhere. These are habits I developed because not having them caused problems:
Write specific task descriptions, not vague prompts. "Improve the auth code" produces mediocre results. "Audit the authentication middleware for missing token expiry checks and add tests for the edge cases" produces something useful. The more specific the task, the more useful the output.
Stage large tasks explicitly. Do not hand over a task touching thirty files in one go. Break it into explicit phases: schema changes first, then API layer, then tests. You will get better results and catch problems before they compound.
Always review before committing. Nothing Claude Code produces goes into a commit without me reading every changed line. This is not distrust. It is the same standard I hold for any code review. Claude does not have context I have not given it, and the gaps show up in the output.
Use the right tool for the job. Claude.ai for reasoning, architecture, documentation drafts, and thinking through trade-offs. Claude Code for implementation, refactoring, and anything touching actual files.
Maintain CLAUDE.md as the project evolves. When conventions change, update CLAUDE.md. When something breaks because Claude Code did not know about a constraint, add that constraint. It is the cheapest form of project memory that exists.
Push back on confident-sounding wrong answers. If something feels off, say so. "Are you sure this handles the case where X?" produces a second pass that often corrects the first. The model responds well to direct pushback.
Do not use it as a shortcut for understanding. I use Claude to go faster on things I already understand. When Claude produces architecture I do not fully follow, I stop and ask until I do. Shipping code you do not understand is a debt you pay later.
Keep write access in MCP integrations minimal. Read access to GitHub, Postgres, Slack is generally safe. Write access deserves more caution. Start read-only and expand only when there is a clear reason.
My Actual Workflow
This is roughly what my day looks like now:
For anything exploratory or architectural, understanding a new library, working out an approach, writing a spec, I use a claude.ai Project conversation. The persistent context and the ability to attach files make it the right place for thinking.
For codebase tasks, implementing features, refactoring, debugging with test runs, it is Claude Code in the terminal. Specific task description, watch what it does, review every file it changes.
For production issues where context lives in GitHub, Claude Code with the GitHub MCP server. Read the issue, read the code, propose a fix.
None of this replaced how I think about building things. It changed what I do manually versus what I can delegate. The judgment about what to build, whether a design actually makes sense, whether Claude's output is actually correct, that is still on me.
That is where it should be. AI is a force multiplier, not a substitute for engineering judgment. The engineers who understand that distinction are the ones who get the most out of it, and who will keep getting more out of it as the tools continue to improve.
Which they will. Because AI is not going anywhere.