
I think AI is just a massive force multiplier. If your codebase has a bad foundation and is going in the wrong direction with lots of hacks, it will just write code which mirrors the existing style... And you get exactly what OP is suggesting.

If, however, your code foundations are good, highly consistent, and never allow hacks, then the AI will maintain that clean style and it becomes shockingly good; in this case, the prompting barely even matters. The code foundation is everything.

But I understand why a lot of people are still having a poor experience. Most codebases are bad. They work (within very rigid constraints, in very specific environments) but they're unmaintainable and very difficult to extend; they require hacks on top of hacks. Each new feature essentially requires a minor or major refactoring, with more and more scattered code changes as everything is interdependent (tight coupling, low cohesion). Productivity grinds to a slow crawl and you need 100 engineers to do what previously could have been done with just one. This is not a new effect. It's just much more obvious now with AI.

I've been saying this for years but I think too few engineers had actually built complex projects on their own to understand this effect. There's a parallel with building architecture; you are constrained by the foundation of the building. If you designed the foundation for a regular single storey house, you can't change your mind half-way through the construction process to build a 20-storey skyscraper. That said, if your foundation is good enough to support a 100 storey skyscraper, then you can build almost anything you want on top.

My perspective is if you want to empower people to vibe code, you need to give them really strong foundations to work on top of. There will still be limitations but they'll be able to go much further.

My experience is: the more planning and intelligence that goes into the foundation, the less intelligence and planning is required for the actual construction.



The wrinkle is that the AI doesn't have a truly global view, and so it slowly degrades even good structure, especially if run without human feedback and review. But you're right that good structure really helps.


Yet it still fumbles even when limiting context.

Asked it to spot-check a simple rate limiter I wrote in TS. Super basic algorithm: let at most one action through every 250ms, sleeping if necessary. It found bogus errors in my code three times because it failed to see that I was using a mutex to prevent reentrancy. This was about 12 lines of code in total.

My rubber duck debugging session was insightful only because I had to reason through the lack of understanding on its part and argue with it.
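For context, a minimal sketch of what such a limiter can look like (illustrative only, not the commenter's actual code; names like `RateLimiter` are made up): a promise chain acts as the mutex, serializing callers and sleeping so at least `intervalMs` elapses between actions.

```typescript
class RateLimiter {
  private lastRun = 0;
  // The promise chain is the mutex: each caller waits on the previous one,
  // so reads and writes of `lastRun` never interleave.
  private chain: Promise<void> = Promise.resolve();

  constructor(private intervalMs: number) {}

  run(action: () => void | Promise<void>): Promise<void> {
    const result = this.chain.then(async () => {
      const wait = this.lastRun + this.intervalMs - Date.now();
      if (wait > 0) await new Promise((r) => setTimeout(r, wait));
      this.lastRun = Date.now();
      await action();
    });
    // Keep the chain alive even if the action throws.
    this.chain = result.catch(() => undefined);
    return result;
  }
}
```

The chain is exactly the kind of thing a reviewer (human or model) has to notice: without it, two concurrent callers could both read `lastRun` before either updates it.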


Once you've gone through that, you might want to ask it to codify what it learned from you so you don't have to repeat it next time.


I would love to see that code.


Try again with gpt-5.3-codex xhigh.


Try again with Opus 4.5

Try again with Sonnet 4

Try again with GPT-4.1

Here I thought these things were supposed to be able to handle twelve lines of code, but they just get worse.


The goalposts have been moved so many times that they’re not even on the playing field.


Nahh, just trying to make it concrete. I could just ask which model they used instead.


I have to 1000% agree with this. In a large codebase they also miss stuff. Actually, even at 10kloc the problems begin, UNLESS your code is perfectly designed.

But which codebase is perfect, really?


AGENTS.md is for that global view.


The 'global view' doc should be in DESIGN.md so that humans know to look for it there, and AGENTS.md should point to it. Similar for other concerns. Unless something really is solely of interest to robots, it shouldn't live directly in AGENTS.md, AIUI.
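A minimal sketch of that split, with purely illustrative contents (neither file's text comes from the thread):

```markdown
<!-- AGENTS.md: thin, robot-facing entry point -->
# Agent instructions
- Read DESIGN.md before making changes; it holds the global architecture view.
- Run the test suite after every change.

<!-- DESIGN.md: the human-facing global view that AGENTS.md points to -->
# Design overview
- Module map, key invariants, and the "why" behind major decisions live here.
```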


You can't possibly cram everything into AGENTS, and LLMs still don't give the same weight to all of their context, i.e. they still ignore instructions.


Am I stupid or do these agents regularly not read what’s in the agents.md file?


More recent models are better at reading and obeying constraints in AGENTS.md/CLAUDE.md.

GPT-5.2-Codex did a bad job of obeying my more detailed AGENTS.md files but GPT-5.3-Codex very evidently follows it well.


Perhaps I’m not using the latest and greatest in terms of models. I tend to avoid using tools that require excessive customization like this.

I find it infinitely frustrating to attempt to make these pieces of shit “agents” do basic things like running the unit/integration tests after making changes.


Opus 4.5 successfully ignored the first line of my CLAUDE.md file last week


Thank god it’s not just me. It really makes me feel insane reading some of the commentary online.


Each agent uses a different file, like claude.md etc (maybe you already knew that).

And it requires a bit of prompt engineering like using caps for some stuff (ALWAYS), etc.


You’re not stupid. But the agents.md file is just an md file at the end of the day.

We’ve been acting as if it’s assembly code that the agents execute without question or confusion, but it’s just some more text.


That’s not what Claude and Codex put there when you ask them to init it. Also, the global view is most definitely bigger than their tiny, loremipsum-on-steroids, context so what do you do then?


You know you can put anything there, not just what they init, right? And you can reference other doc files.

I should probably stop commenting on AI posts because when I try to help others get the most out of agents I usually just get down voted like now. People want to hate on AI, not learn how to use it.


it's still not truly global, but that seems a bit pie in the sky.

people still do useful work without a global view, and there's still a human in the loop with the same old amount of global view as they ever had.


I agree completely.

I just did my first “AI native coding project”. Both because, for now, I haven't run into any quotas using Codex CLI with my $20/month ChatGPT subscription, and because the company just gave everyone an $800/month Claude allowance.

Before I even started the implementation, I put into context:

1. The initial sales contract with the business requirements

2. Notes I got from talking to sales

3. The transcript of the initial discovery calls

4. My design diagrams, which were well labeled (cloud architecture and what each Lambda does)

5. The transcript of the design review, with my explanations and answers to questions

6. My ChatGPT-assisted breakdown of the Epics/stories and tasks I had to do for the PMO

I then told ChatGPT to give a detailed breakdown of everything during the session as Markdown

That was the start of my AGENTS.md file.

While working through everything task by task and having Codex/Claude Code do the coding, I told it to update a separate md file with what it did, when I told it to do something differently, and why.

Any developer coming in after me will have complete context of the project from the first git init and they and the agents will know the why behind every decision that was made.

Can you say that about any project that was done before GenAI?


> Can you say that about any project that was done before GenAI?

… a project with a decomposition of top level tasks, minutes and meeting notes, a transcript, initial diagrams, a bunch of loose transcripts on soon to be outdated assumptions and design, and then a soon-to-be-outdated living and constantly modified AGENT file that will be to some extent added to some context and to some extent ignored and to some extent lie about whether it was consulted (and then to some extent lie more about if it was then followed)? Hard yes.

I have absolutely seen far better initial project setups that are more complete, more focused, more holistically captured, and more utilitarian for the forthcoming evolution of design and system.

Lots of places have comparable design foundations as mandatory, and in some well-worn government IT processes I’m aware of the point being described is a couple man-months or man-years of actual specification away from initial approval for development.

Anyone using issue tracking will have better, searchable, tracking of “why”, and plenty of orgs mandate that from day 1. Those orgs likely are tracking contracts separately too — that kind of information is a bit special to have in a git repo that may have a long exciting life of sharing.

Subversion, JIRA, and basic CRM setups all predate GPTs public launch.


> soon to be outdated assumptions

Wild assumption. Having docs and code in step has never been easier.

> soon-to-be-outdated living and constantly modified AGENT file

Quite contradictory.

> I have absolutely seen far better initial project setups that are more complete, more focused, more holistically captured, and more utilitarian for the forthcoming evolution of design and system.

From a single dev, in a day's work? I call massive bs on this.


Absolutely no developer is going to search through issue trackers. Are you comparing that to being right in your terminal, telling the agent to update the file with what you are doing and why?

How many developers actually want to ruin their flow and use a bloated CRM or Jira with some type of inane workflow set up by the PMO, compared to just staying in the terminal?

If there is any change to the initial contract, there is change order - you put that through the same workflow.

And do you really want to use how the government works as the model of efficiency? No, this isn't coming from a right-wing government hater or libertarian who says we don't need government; I've worked in the pub-sec department of consulting (AWS ProServe WWPS).


That sounds really powerful, but also like the burden shifts to the people who will maintain all this stuff after you're done having your fun.

Tbh, I'm not exactly knocking it; it makes sense that leads are responsible for the architecture. I just worry that those leads having 100x influence is not by default a good thing.


My thought is that the markdown is the code and that Claude code/Codex is the “compiler”.

The design was done by me. The modularity, etc.

I tested for scalability, I checked the IAM permissions for security, and I designed the locking mechanism and concurrency controls (which had a bug in it that was found by ChatGPT in thinking mode).


> Can you say that about any project that was done before GenAI?

yes. the linux kernel and its extensive mailing lists come to mind. in fact, any decent project which was/is built in a remote-only scenario tends to have extensive documentation along these lines; something like gitlab comes to mind there.

personally i've included design documents with extensive notes, contracts, meeting summaries etc etc in our docs area / repo hosting at $PREVIOUS_COMPANY. only thing from your list we didn't have was transcripts because they're often less useful than a summary of "this is what we actually decided and why". edit -- there were some video/meeting audio recordings we kept around though. at least one was a tutoring session i did.

maybe this is the first time you've felt able to do something like this in a short amount of time because of these GenAI tools? i don't know your story. but i was doing a lot of this by hand before GenAI. it took time, energy and effort to do. but your project is definitely not the first to have this level of detailed contextual information associated with it. i will, however, concede that these tools can make it easier/faster to get there.


Well, I was developing as a hobby for 10 years starting with an Apple //e in 65C02 assembly language before graduating from college…if that gives you a clue to my age and I am old enough that I am eligible to put catch up contributions in my 401K…

If I had to scope this project before GenAI, it would have taken two other developers to do the work I mentioned, not to mention the changes to a web front end that another developer did for another client on a project I was leading; I haven't touched front-end code in over a decade.


This is what I’ve discovered as well. I’ve been working on refactoring a massive hunk of really poor quality contractor code, and Codex originally made poor and very local fixes/changes.

After rearchitecting the foundations (dumping bootstrap, building easy-to-use form fields, fixing hardcoded role references 1,2,3…, consolidating typescript types, etc.) it makes much better choices without needing specific guidance.

Codex/Claude Code won’t solve all your problems though. You really need to take some time to understand the codebase and fixing the core abstractions before you set it loose. Otherwise, it just stacks garbage on garbage and gets stuck patching and won’t actually fix the core issues unless instructed.
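As a hedged illustration of one of the foundation fixes mentioned above (all names hypothetical, not from the actual codebase): replacing magic role numbers like `1,2,3` with a named enum gives both humans and the model a single source of truth to reason from.

```typescript
// Before: `if (user.role === 2) ...` scattered through the codebase.
// After: one named definition; `2` now has a meaning the agent can see.
enum Role {
  Admin = 1,
  Editor = 2,
  Viewer = 3,
}

interface User {
  name: string;
  role: Role;
}

// Permission checks read as intent instead of arithmetic on magic numbers.
function canEdit(user: User): boolean {
  return user.role === Role.Admin || user.role === Role.Editor;
}
```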


A tangent: I keep hearing about this good base, but I've never seen one, not in the real world.

No project will have this mythical base unless it's only you working on it, you're the only client, and the scope is so rigid it's frankly useless. Over time the needs change; there's no sticking to the plan. Often it's a change that requires rethinking a major part. What we loathe as tight coupling was just efficient code under the original requirements. Then it becomes a time/opportunity-cost vs. quality-loss comparison, and time and opportunity always win. Why?

Because we live in a world run by humans, who are messy and never stick to the plan. Our real-world systems (bureaucracy, government process, the list goes on) are never fully automated and always leave gaps for humans to intervene. There's always a special case, an exception.

Perfectly architected code vs. code that does the thing has no real-world difference. Long-term maintainability? Your code doesn't run in a vacuum; it depends on other things, and its output is depended on by other things. Change is real, entropy is real. Even you, the perfect programmer who writes perfect code, will succumb eventually and think back on all this with regret. Because you yourself had to choose between time/opportunity and your ideals, and you chose wrong.

Thanks for reading my blog-in-hn comment.


It’s not about perfectly architected code. It’s more about code that is factored in such a way that you can extend/tweak it without needing to keep the whole of the system in your head at all times.

It’s fascinating watching the sudden resurgence of interest in software architecture after people are finding it helps LLMs move quickly. It has been similarly beneficial for humans as well. It’s not rocket science. It got maligned because it couldn’t be reduced to an npm package/discrete process that anyone could follow.


Very well put.

I've always been interested in software architecture and upon graduating from university, I was shocked to see the 'Software Architect' title disappear. Software devs have been treating software architecture like phrenology or reading tea leaves.

But those who kept learning and refining their architecture skills during this time look at software very differently.

It's not like the industry has been making small, non-obvious mistakes; they've been making massive, glaringly obvious mistakes! Anticipating a reasonable range of future requirements in your code and adhering to the basic principles of high cohesion and loose coupling is really not that hard.

I'm taken aback whenever I hear someone treating software architecture as some elusive quest akin to 'finding Bigfoot'.


Well-architected code should actually be easy to change wrt. new requirements. The point of keeping the architecture clean while you do this (which will typically require refactoring) is to make future changes similarly viable. In a world run by messy humans, accumulating technical debt is even more of a liability.


An important point, though, is that LLM code generation changes that tradeoff. The time/opportunity cost goes way down while the productivity penalty starts accumulating very fast. Outcomes can diverge very quickly.


> No projects, unless it's only you working on it, only yourself as the client, and is so rigid in it's scope, it's frankly useless, will have this mythical base.

This is naive. I've been building an EMR in the healthcare space for 5 years now as part of an actual provider. We've incrementally released small chunks when they're ready. The codebase I've built is the most consistent codebase I've ever been a part of.

It's bureaucracy AND government process AND constantly changing priorities and regulations and requirements from insurance providers all wrapped up into one. And as such, we have to take our time.

Go and tell the clinicians currently using it that it's not useful. I'm sure they won't agree.

> Perfectly architected code vs code that does the thing have no real world difference

This just flat out isn't true. Just because YOU haven't experience it (and I think you're quite frankly telling on yourself with this) doesn't mean it doesn't exist at all.

> Because you yourself had to choose between time/opportunity vs your ideals and you chose wrong.

Like I said above, you're telling on yourself. I'm not saying I've never been in this situation, but I am saying that it's not the only way to build software.


Lesson learned. Yes, you are right. I am indeed a junior; honestly, I made that comment when I was tired, in the middle of a rushed project. There's no delete button, otherwise I'd have deleted it when I cooled off. Thank you for giving me hope that good code is still being made.


> Thank you for giving me hope that good code is still being made.

So I've been on both sides, and it's why I responded. While you are absolutely correct that those situations do exist, I just wanted to point out it's not always that way. And I felt exactly as you did about software in general until I finally found a place or two that wasn't just a cash printing machine.

And it's pretty awesome. I've come to realize burnout is less about the amount of hours you put in and more about what you're doing during those hours.

It's tough, especially in the beginning. Push through it. Get some experience that allows you to be a bit more selective in what you choose, and fingers-crossed you'll find yourself in the same spot. One common denominator in all of the good jobs I've had was that the leadership in those companies (3 of them) were all tech-focused. Could be a coincidence, but it's a pattern I've seen.


This does not track with my experience, trying agents out in a ~100K LOC codebase written exclusively by me. I can't tell you whether or not it has a good foundation by your standards, but I find the outputs to be tasteless, and there should be more than enough context for what the style of the code is.

Given how adamant some people I respect a lot are about how good these models are, I was frankly shocked to see SOTA models do transformations like

  BEFORE:
    // 20 lines

  AFTER:
    if (something)
        // the 20 lines
    else
        // the same 20 lines, one boolean changed in the middle
When I point this out, it extracts said 20 lines into a function that takes in the entire context used in the block as arguments:

  AFTER 2:
    if (something)
       function_that_will_never_be_used_anywhere_else(a, b, c, &d, &e, &f, true);
    else
       function_that_will_never_be_used_anywhere_else(a, b, c, &d, &e, &f, false);
It also tends to add these comments that don't document anything, but rather just describe the latest change it did to the code:

  // Extracted repeating code into a function:
  void function_that_will_never_be_used_anywhere_else(...) {
      ...
  }
and to top it off it has the audacity to tell me "The code is much cleaner now. Happy building! (rocketship emoji)"
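For contrast, the fix a human reviewer would usually expect is to keep one copy of the shared lines and branch only on the value that actually differs. A hedged sketch, with a toy stand-in for the duplicated "20 lines" (all names illustrative):

```typescript
// Toy stand-in for the duplicated block: only `verbose` differs between
// the two branches, so the flag lives inside the shared code instead of
// the whole block being copied once per branch.
function renderReport(items: string[], verbose: boolean): string {
  const header = `Report (${items.length} items)`;
  const body = items
    .map((it) => (verbose ? `- ${it} [ok]` : `- ${it}`))
    .join("\n");
  return `${header}\n${body}`;
}

// One call site, one copy of the logic, no near-duplicate branches:
const output = renderReport(["a", "b"], true);
```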


And what if the foundation was made by the AI itself? What’s the excuse then?


Then you are boned unless it was architected well. LLMs tend to stack a lot of complexity at local scopes, especially if the neighboring pages are also built poorly.

E.g pumping out a ton of logic to convert one data structure to another. Like a poorly structured form with random form control names that don’t match to the DTO. Or single properties for each form control which are then individually plugged into the request DTO.


> Then you are boned

Must be my lucky day! Too bad my dream of being that while the bots are taking care of the coding is still sort of fiction.

I'd love a future where this is possible, but what we have today is more of a proof of concept. A transformative leap is required before this technology can be as useful as advertised.


Yep, it’s still a bit off from being a true developer. But good news for existing software devs who will need to be hired to fix LLM balls of mud that will inevitably fall apart.

In my mind it’s not too much different than cheap contractor code that I already have to deal with on a regular basis…


you could also use some code-styling agent scripts that make todo lists of everywhere there's bad architecture, and have it run through fixing those issues until it's to your liking.

they're reasonable audit tools for finding issues, if you have ways to make sure they don't give up early, and you force them to output proof of what they did


And that is harder than just doing it manually, hence the point that the hard parts are harder. If you have a clear picture of what you want it to do, then it's harder to vibe code it than to code it yourself.


Your responsibility as a developer in this new world is design and validation.

A poor foundation is a design problem. Throw it away and start again.


We’ve always been responsible for design and validation. Nothing has changed there.

It’s funny how the vibe coding story insists we shouldn’t look at the code details but when it’s pointed out the bots can’t deal with a “messy” (but validated) foundation, the story changes that we have to refactor that.


But how will new developers learn to design and validate in the future?


Can the AI help with refactoring a poor codebase? Can it at least provide good suggestions for improvement if asked to broadly survey a design that happens to be substandard? Most codebases are quite bad as you say, so this is a rather critical area.


When you say multiplier, what kind of number are you talking about? Like, what multiple of features shipped that don't require immediate fixes have you experienced?


It's coding at 10-20x speed, but tangibly this is at 1.5-2x the overall productivity. The coding speed up doesn't translate completely to overall velocity yet.

I am beginning to build a high degree of trust in the code Claude emits. I'm having to step in with corrections less and less, and it's single shotting entire modules 500-1k LOC, multiple files touched, without any trouble.

It can understand how frontend API translates to middleware, internal API service calls, and database queries (with a high degree of schema understanding, including joins).

(This is in a Rust/Actix/Sqlx/Typescript/nx monorepo, fwiw.)


Okay, but again: what multiplier of features have you actually shipped?


my exact experience, and AI is especially fragile when you're starting a new project from scratch.

Right now I'm building an NNTP client for macOS (with AppKit), because why not, and initially I had to very carefully plan and prompt what the AI has to do, otherwise it would go insane (integration tests are a must).

Right now I have the read-only mode ready and it's very easy to build stuff on top of it.

Also, I had to provide a lot of SKILLS to GPT-5.3.


how do you know there is such a thing as good code foundations, and how do you know you have it? this is an argument from ego


Induction always sneaks in!


socketcluster nailed it. I've seen this firsthand — the same agent produces clean output when the codebase has typed specs and a manifest, and produces garbage when it's navigating tribal knowledge. The hard part was always there. Agents just can't hide it like humans can.


AI doesn't fix design debt, it amplifies it



