"So we've been talking about AI..." - the series, chapter 2 - Coding assistants
Engineering - please hold on, my robot will assist you
I've equipped teams with licenses to various AI IDEs and a simple set of goals: use them on legacy code, on existing and active code, and for new products. I wanted to find out what is productive practice for a regular team in general terms, and what is not.
Our theory was that having tools to improve our specs would help more than code generation alone, and that there could be something connecting the two.
Having worked with other developer-assisting tools, I am convinced that CI/CD, a golden path to production, observability and a boring stack are the way to go, and hardly anything would top that, so the assistants would have to blend in.
Agents are the best part of modern coding assistants, covering everything from boilerplate creation to shell scripting and file editing. These agents can be extended and integrated through local scripts and MCP (Model Context Protocol) servers such as GitHub's. Their value has less to do with the quality or depth of the current model than with how their outputs can be chained.
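As an illustration, wiring an MCP server into a client is typically a small JSON block like the sketch below. The exact keys, command and image name are assumptions here; check your client's and the server's own documentation:

```json
{
  "mcpServers": {
    "github": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "GITHUB_PERSONAL_ACCESS_TOKEN",
        "ghcr.io/github/github-mcp-server"
      ],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your token>" }
    }
  }
}
```

Once registered, the agent can call the server's tools (issues, pull requests, and so on) as part of its chain of actions.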
The risks beyond common boilerplate tasks are a kind of hallucination: not in the model sense, not code that your language won't interpret or compile, but more like assembly-line issues:
Packages and dependencies that do not exist, and that can be targeted by malicious actors who run the same agents to infer which packages they would commonly suggest.
Models are trained on data with a cutoff. You will probably have to enhance your prompt and instructions to use the latest versions of frameworks, languages and libraries, and consequently deal with breaking changes.
Looping and refactoring beyond reason: if you encode an image with a lossy codec again and again, it starts losing detail and resolution. A fast and unbounded agent can have the same effect: duplicated utility frameworks, code patterns that are statistically relevant (for its training) but that you wouldn't use, and codebase-wide refactoring. If you use ORMs, expect database refactoring, wiping, and a cascade of migrations and model refactoring. It is a loop caused by the limited token window and lack of context memory: the model fails to identify what seems to be an issue, tries another way of solving it, another way of approaching the problem, hits another error, goes back to square one, and so on until you find yourself with three UI frameworks or a new database.
Narrow token windows and memory loss: as I wrote above, limits on API calls and token windows make the work of agents very procedural, almost like shell scripts. They execute what is given to them, bounded by the token window. Appending local agent instructions, code, memories and prompts consumes space in this window, along with the coding agent's own context and instructions. The context is kept in a sort of most-recently-used fashion and the model can quickly forget what you were talking about. Groundhog Day.
To help mitigate that:
Deploy from the first time you have anything working. Deploying often, even to a test environment, establishes an end-to-end discipline that prevents big changes, especially if you are using any kind of ORM or database migration framework.
Create two local files, memory.md and changelog.md. Add your task or the equivalent product specification to memory.md and instruct the model to update both: one with architecture decisions, the other with changes made along the planned specification.
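A minimal sketch of what such a memory.md could look like; the headings and every entry below are made up, adapt them to your workflow:

```markdown
# memory.md

## Task / specification
Build a CSV import endpoint for the billing service (ticket BILL-123).

## Architecture decisions
- Store raw uploads in object storage, parse asynchronously.
- Reuse the existing importer package; do not add a new queue.

## Open questions
- Maximum upload size? Waiting on product.
```

The point is that decisions survive the token window: the agent re-reads this file at the start of each session instead of rediscovering (or contradicting) earlier choices.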
Do your work in a new git (or any other version control) branch, and tag or commit every time you reach a sane point. Stay a git stash away from a working and deployable app.
Instruct the agent to plan, reason then execute.
Do not go on long streaks of agent coding. Make a habit of using it as a streaming template tool, but, especially if you work with a team, beware of too much unmanaged, automatically generated code.
Combine tools, like Claude and ChatGPT, to help write specs and review parts of the code. A model ensemble helps break loops and produces better results.
Scan your dependencies for introduced malicious packages.
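As a sketch of the simplest possible version of such a scan, here is a check of a requirements-style file against an internal allowlist of vetted packages. The allowlist and package names are made up; real scanners also query registries, verify hashes and catch typosquats:

```python
# Flag dependencies in a requirements.txt-style file that are not on a
# vetted internal allowlist. A deliberately naive sketch.

def parse_requirements(text):
    """Extract bare, lowercased package names, ignoring comments."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline comments
        if not line:
            continue
        # keep the name before any version specifier
        for sep in ("==", ">=", "<=", "~=", ">", "<"):
            if sep in line:
                line = line.split(sep, 1)[0]
                break
        names.append(line.strip().lower())
    return names

def flag_unknown(requirements_text, allowlist):
    """Return dependencies that are not in the vetted allowlist."""
    return [n for n in parse_requirements(requirements_text)
            if n not in allowlist]

ALLOWLIST = {"requests", "flask", "sqlalchemy"}  # hypothetical vetted set
reqs = """
requests==2.31.0
flask>=2.0
requezts==1.0  # typosquat-looking name an agent might invent
"""
print(flag_unknown(reqs, ALLOWLIST))  # -> ['requezts']
```

Run it in CI so a hallucinated dependency fails the build before anyone installs it.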
Do the same for code under licenses that may have been wrongly included in the model's training data.
Clean up your code: remove unused paths and libraries, all the time.
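Part of this clean-up can be automated. Below is a rough heuristic sketch using Python's ast module that flags top-level imports never referenced in a module; in practice you would lean on linters such as ruff or vulture instead:

```python
# Rough heuristic: report imported names that are never used in a module.
import ast

def unused_imports(source):
    """Return sorted import names with no matching usage in the source."""
    tree = ast.parse(source)
    imported, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)  # any read or write of a bare name
    return sorted(imported - used)

code = """
import os
import json as j
from math import sqrt

print(sqrt(os.getpid()))
"""
print(unused_imports(code))  # -> ['j']
```

Wired into a pre-commit hook, even a crude check like this catches the dead imports an agent leaves behind after abandoning an approach.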
Intermission: Low Code, Vibe coding and MVPs
Speed of execution gives everyone the possibility to test and deliver quickly. I find it healthy that people other than developers can use Low Code or No Code tools such as Lovable to prove their theses, build frontends, back office platforms and support tools.
They are great for enterprise integrations too, where you want to extend your product for a specific sales case or opportunity and showcase capabilities. For product managers, this can help them refine and extend their specification and shed the discovery phase's bad reputation by pushing a more palpable version of their ideas.
From zero to something, faster than an App Builder in the 2000s, which helped a lot of founders build big companies and create their internal lore of legacy and monoliths. It is a circular and repeatable movement, each time more refined, faster and more interesting. And of course with turbo spaghetti code generation :D
I have released a new book: “Go for Gophers”. It is a progressive, hands-on book for teams and engineers that are adopting Go and looking for an idiomatic path forward. Quick, practical and loaded with useful examples. Check it out.
If you are a CTO, Tech Lead or Product Leader, check The CTO Field Guide. It is a book to help product engineering leaders of all levels. If you need personalized help, check the mentoring program based on the book at https://ctofieldguide.com/mentoring.html.
