When you write the code yourself, you slowly build up a mental model of how it should work. If you introduce a subtle bug along the way, at least you already understand the code well, so it shouldn't be hard to work backwards and find which assumptions turned out to be incorrect.
But now with Claude, the mental model of how your code works is not in your head. It sits behind a chain of reasoning from Claude Code that you are not privy to. When something breaks, you either have to spend much longer piecing together what your agent has made, or keep throwing Claude at it and hope it doesn't spiral into more subtle bugs.
Everybody produces bugs, but Claude is good at producing code that looks like it solves the problem but doesn't. Developers worth working with, grow out of this in a new project. Claude doesn't.
An example I have of this is when I asked Claude to copy some functionality from a front-end application to a back-end application. It got all of the function signatures right but then hallucinated the contents of the functions. Part of this functionality included a lookup map for some values. The new version had entirely hallucinated keys and values, but the values sounded correct if you didn't compare with the original. A human would have literally copied the original lookup map.
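For what it's worth, a mechanical comparison along these lines would flag that kind of drift in a ported lookup map. This is only a sketch: the map contents and the names `ORIGINAL_MAP`, `PORTED_MAP`, and `diff_lookup_maps` are made up for illustration and are not from the actual project.

```python
# Hypothetical data: ORIGINAL_MAP stands in for the front-end lookup map,
# PORTED_MAP for the back-end copy. Neither comes from the real code.
ORIGINAL_MAP = {"pending": 1, "active": 2, "closed": 3}
PORTED_MAP = {"pending": 1, "active": 2, "archived": 4}


def diff_lookup_maps(original, ported):
    """Report keys that are missing, unexpected, or mapped to a different value."""
    shared = original.keys() & ported.keys()
    return {
        "missing": sorted(original.keys() - ported.keys()),
        "extra": sorted(ported.keys() - original.keys()),
        "changed": sorted(k for k in shared if original[k] != ported[k]),
    }


report = diff_lookup_maps(ORIGINAL_MAP, PORTED_MAP)
print(report)
```

An empty report in all three lists would mean the port matches the original, which is the only case where "it sounds correct" is actually backed by something.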
I asked Claude to help me figure out a statistical calculation in Apple Numbers. It helpfully provided the results of the calculation. I ignored them, implemented the calculation in the spreadsheet, and got completely different (correct) results. Claude did help me figure out how to do it correctly though!
> Developers worth working with, grow out of this in a new project. Claude doesn't.
There is no way this is true. People make fewer bugs with time and guidance, but no human makes zero bugs. Also, bugs are not planned; it's always easy to say in hindsight "A human would have literally copied the original lookup map," but every bug comes from some mistake that departs from the status quo. That's why it's a bug.
No, it's broadly true. Also, that's why we have code review and tests, so that it has to pass a couple of filters.
LLMs don't make mistakes like humans make mistakes.
If you're a SWE at my company, I can assume you have a baseline of skill and you tested the code yourself, so I'm trying to look for any edge cases or gaps or whatever that you might have missed. Do you have good enough tests to make both of us feel confident the code does what it appears to do?
With LLMs, I have to treat their code as if it came from a hostile adversary trying to sneak in subtle backdoors. I can't trust anything to be done honestly.
Sorry, perhaps I should have been clearer. They don't grow completely out of making bugs (although they do tend to make fewer over time), they grow out of making solutions that look right but don't actually solve the problem. This is because they understand the problem space better over time.