Cognitive Debt
After using many Claudes non-stop for a month, I feel stupider. Not in a good way - I am not awed by the sheer brilliance of superintelligent coding agents - but in a way where I find myself becoming increasingly lazy, deciding not to put in the cognitive work required to build a computer system of any meaningful complexity, and looking for quick “productivity hits” rather than outcomes that result from slow, meaningful labor. I heard about the term Cognitive Debt from Simon Willison’s blog, which led to Margaret-Anne Storey’s article, which led to a research paper by Kosmyna et al., 2025.
Your brain on ChatGPT (arXiv)#
To explore the cognitive effects of using ChatGPT, the authors chose a narrow task - SAT-style essay writing - and created three groups of 18 students: those who could use ChatGPT, those who could only use a search engine, and those who could use neither (the brain-only group). They then measured the quality of the essays and brain activity using EEGs. Quality is difficult to measure, so they used a few different approaches to triangulate it.
The conclusions of the paper match my lived experience using Claude Code. Quoting directly from the discussion section:
Taken together, the behavioral data revealed that higher levels of neural connectivity and internal content generation in the Brain-only group correlated with stronger memory, greater semantic accuracy, and firmer ownership of written work. Brain-only group, though under greater cognitive load, demonstrated deeper learning outcomes and stronger identity with their output. The Search Engine group displayed moderate internalization, likely balancing effort with outcome. The LLM group, while benefiting from tool efficiency, showed weaker memory traces, reduced self-monitoring, and fragmented authorship.
Computer systems vs. computer programs#
When we were early in our careers, studying computer science, most of our work involved computer programs, usually in the context of homework assignments. We had no burden of maintaining our programs, fixing bugs, or evolving them when the next feature came along.
Going from student to professional, or even from early to mid career, required closing the gap from programs to systems. Computer programs, taken together, form computer systems and give rise to an immense amount of complexity that cannot be captured by any single program. In many ways, this reminds me of Stephen Wolfram’s work on cellular automata, and complexity emerging from a simple set of rules (ref: longish article).
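To make that analogy concrete, here is a tiny toy sketch of my own (not from Wolfram’s article): an elementary cellular automaton. The entire update rule fits in a few lines, yet rules like 110 produce patterns that never settle into anything you could predict by just staring at the rule.

```python
# A toy illustration of complexity from simple rules: an elementary
# cellular automaton. One row of cells, one lookup-table rule.

def step(cells: list[int], rule: int = 110) -> list[int]:
    """Apply an elementary CA rule to a row of 0/1 cells (wrapping at the edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right  # index 0..7
        out.append((rule >> neighborhood) & 1)               # look up the rule bit
    return out

# Start from a single live cell and watch structure emerge.
row = [0] * 40
row[20] = 1
for _ in range(20):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```

The rule is eight bits of information; the behavior it produces is anything but. Systems built out of many interacting programs feel the same way.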
Dealing with this complexity requires us to understand the systems that we (or others) have built. I swear by Feynman’s quote: “What I cannot create, I do not understand.” I often do not understand the code that Claude is spitting out, especially in areas where I do not have domain expertise. I suspect that most people moving at the speed of Claude don’t either.
The thing with computer systems is that you don’t see the complexity for a while. The start of any project is easy, and initial velocity, when you don’t have users or bugs, is very fast. Complexity develops rapidly - usually the first 10 users do something wonky with your system that you didn’t anticipate, and now you need to tweak it, add a feature here, move some databases there. The first instinct here might be to just have Claude hammer away at it and fix your system until you get what you need. But you don’t understand this beast, so you cannot critique what Claude did or whether it makes any sense. This is the road to software that is going to be extremely hard to maintain and modify, even with Claude.
I think the term Cognitive debt perfectly captures this feeling. Claude has demolished Technical debt and replaced it with Cognitive debt. I can code with impunity, crush bugs, and build stuff that I don’t really understand. To make matters worse, we are gorging on this debt like it’s candy. We do not understand when and how this debt comes due. Over the past week, I have seen some glimpses.
The clearest indicator is when I have to go back over my own code - either to address code review comments or to change something - and realize that I don’t really understand what’s going on. I didn’t write the code, and reading code, no matter how carefully, is just not the same as having written it. I end up with no good choices:
- Debt payoff: Pseudo-re-implement the code to really understand the system. This is time-consuming, often more so than implementing the thing the first time around.
- More debt: Get Claude to fix the problem without understanding the code. It may or may not work, and it might give you some wonky results.
Another clear indicator that you’re in some sort of debt cycle is when a reasonably simple PR takes 10 commits to pull together, when you know from experience that if you had implemented it by hand, you’d have done it in half the number of commits. When you start asking Claude to fix all the review comments without really understanding them - you’ve essentially declared bankruptcy.
The way out#
First, we need more research. This whole cognitive debt cycle is an emergent phenomenon. The paper I referenced above is from 2025, and I could not find related work in the programming domain that replicates similar results. If some enterprising grad student reads this and does the study, the world will be extremely grateful.
We need to understand what happens to software engineers who over-rely on coding tools over a span of time, working on difficult systems. Are they net faster? More productive? Happier? Unlike essays, code has objective measures of quality - bug rates, security issues, cyclomatic complexity - as well as subjective ones - give the code to a few smart people and let them review it. We need to find out what happens when we over-rely on these tools.
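As a sketch of what I mean by cheap, objective measures - my own toy example, not anything from the paper - here is a crude cyclomatic-complexity counter built on Python’s standard library. A real study would lean on established tools, but the point is that numbers like these can be collected mechanically and compared across groups.

```python
# A rough, approximate cyclomatic-complexity count per function,
# using only the standard library. Counts 1 plus the number of
# branching constructs inside each function body.
import ast
import sys

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp, ast.Assert)

def function_complexities(source: str) -> dict[str, int]:
    """Return {function_name: approximate cyclomatic complexity}."""
    tree = ast.parse(source)
    results = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            results[node.name] = 1 + branches
    return results

if __name__ == "__main__":
    # Usage: python complexity.py file1.py file2.py ...
    for path in sys.argv[1:]:
        with open(path) as f:
            for name, cc in sorted(function_complexities(f.read()).items()):
                print(f"{path}:{name}: {cc}")
```

Track numbers like this (plus bug rates and review outcomes) for a tool-heavy cohort and a hand-written cohort over a few months, and you would start to see whether the debt is real.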
Second, we don’t need to wait for research to tell us that we get dumber through over-reliance on Claude. Learning (output) seems intuitively proportional to the amount of cognitive labor (input). Ownership, satisfaction, and overall velocity seem intuitively proportional to that labor as well. We don’t need to mentally flog ourselves over every small thing. However, using Claude Code is a choice, and we should use it as a tool, in the same way that debt (financial, technical, cognitive) is a tool.
The easy stuff:
- When exploring and trying to build quick prototypes, delegating to agents is the way to go. They are fast, and you don’t really need to worry about long-term maintenance.
- When dealing with bugs that you understand but are just too lazy to write a test and a fix for, agents seem ideal. You know exactly what needs to be done, and if the agent doesn’t do it, you can fix it quickly yourself.
- When writing documentation for code that you wrote, agents seem ideal. You have to be careful to keep things succinct so you don’t just check in oodles of slop, but with that caveat, there is almost zero harm in generating docs and diagrams that others can consume.
Be careful:
- Non-trivial changes in an adjacent domain. I am most comfortable in the python / backend / systems layer. I understand the frontend/web layer and can write some code there, but I absolutely do not understand the frontend / mobile layer. When making a mobile change, I have two choices: rely on Claude and just jam the fix through, or go through the long process of learning mobile. For some fixes that are reasonable to eyeball and then test, it makes sense to use Claude. For anything more than that, I’m just creating a lot of work for the code reviewer, and I would have been better off either punting the work to an expert or going through the painful process of learning mobile.
- Complex changes in the core domain. I have a reasonable level of expertise in our core domain. Even here, when I completely delegate to Claude, I can feel my brain just turning off. I do not recognize the code that comes out, and it takes a lot longer to think through the edge cases that Claude might have missed. Ultimately, I often wish that I’d just done the whole thing from scratch myself, because reviewing and fixing the delegated version is way harder than implementing it would have been.
An unsatisfying conclusion#
I don’t have deep concluding thoughts. I’m documenting my journey here - my experiences, and my research into trying to understand them. Amusingly, this almost certainly means that I’m going to use Claude Code less. I give full license to my colleagues to say “I told you so” or otherwise give me a hard time. I’m going to go back to using tab models more. My PR velocity will almost certainly drop, but my satisfaction will go up, I will have much more faith in the code that I write, and I will understand the system better.