Working with ChatGPT

December 13, 2022

I've been an avid user of GitHub Copilot since it came out, and it has transformed how I write programs. I am now able to spend quite a bit of time on writing nice error messages, covering all edge cases with unit tests, devising “complete” APIs, and doing proper error handling.

I use Copilot for basically two use cases:
– generate code I have already written in my head
– fish for API functions I know exist but can't remember the name of. This often leads to discovering patterns I didn't yet know of

I do tend to turn Copilot off when I have “real” programming to do, because it tends to generate nonsense if it doesn't already have two or three base cases to latch onto.

I do sometimes use it as an idea generator, writing a little line of text and waiting for it to suggest some function names or structures. I've found good ideas that way, for example, suggested serialization structures that had more fields than I would have thought of adding, or completing accessor functions that I didn't fully think through.

ChatGPT is on another level

ChatGPT is on another level. I had planned to use it for the day to see what I could get out of it, and I have used it to generate the following:
– a list of concrete instances of structures that test edge cases of an import function
– a set of Terraform resources to deploy a cronjob on AWS Fargate
– a Makefile to build and push a Docker image to an AWS registry
– a PHP script to collect and zip a set of resource files, naming the zip file according to the timestamp, git hash, and branch, and pushing it to an S3 bucket
– a set of Next.js endpoints to implement e-commerce cart functionality
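
To give a sense of how small these tasks are, here is a minimal Python sketch of the naming-and-zipping part of the fourth item. It is not the PHP script ChatGPT produced for me. The function names, the `resources-` prefix, and the layout of the file name are assumptions made up for illustration, and the S3 upload is left out. In a real script the branch and hash would probably come from `git rev-parse --abbrev-ref HEAD` and `git rev-parse HEAD`; here they are plain arguments so the sketch runs anywhere.

```python
import re
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def archive_name(branch: str, git_hash: str, when: datetime) -> str:
    """Build a zip file name from a timestamp, a branch name, and a git hash.

    The "resources-<timestamp>-<branch>-<hash>.zip" layout is a hypothetical
    convention chosen for this sketch.
    """
    # Branch names may contain slashes (e.g. "feature/cart"), which are
    # unsafe in file names, so anything unusual is collapsed to a dash.
    safe_branch = re.sub(r"[^A-Za-z0-9._-]+", "-", branch).strip("-")
    # Normalize to UTC so names sort the same way on every machine.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"resources-{stamp}-{safe_branch}-{git_hash[:7]}.zip"


def zip_resources(files: list[Path], root: Path, dest: Path) -> Path:
    """Write the given files into the archive at dest, stored relative to root."""
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=path.relative_to(root).as_posix())
    return dest
```

Calling `archive_name("feature/cart", "1a2b3c4d5e6f", datetime.now(timezone.utc))` gives a name that can be passed, joined to an output directory, as `dest` to `zip_resources`.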

None of this is code that is in any way complicated to write, and I had to redirect ChatGPT from time to time, but the fact that I was able to build all of the above in about an hour is absolutely mind-boggling. Not only did it generate reasonable answers, but it also did a stellar job of documenting what it was doing.

It means I can fully focus on intent, testing, documentation, developer UX, and the “complex” part of my job, without having to hunt down Terraform documentation or copy-paste some ECR Makefile magic.

The discourse around ChatGPT and Copilot is weird

I think most of the discussion around ChatGPT and Copilot is disconnected from the experience of actually using them. I'm talking about claims such as:
– that they are dangerous because they produce broken code,
– that you can induce them to regurgitate complete functions verbatim,
– that they will make us stupid and careless

Both ChatGPT and Copilot confidently generate a lot of wrong or nonsensical suggestions. So does my IDE auto-completion. I found that anytime I dare Copilot to generate something I don't already know how to write, or something I have already written but with different arguments, it will generate something that looks sensible on the surface but is wrong. Reviewing such suggestions is mentally more taxing than writing the code myself, and after a while, I started to “feel” what Copilot will do well and what it won't.

Will people get bitten by blindly accepting Copilot's or ChatGPT's suggestions? Sure. If you care about your application, you will quickly notice that things are not working. Poor programmers can write broken code entirely without machine help. Good programmers will quickly realize that even the best code, machine-generated or not, will need review, testing, and validation.

Solving problems is about more than coding a solution

More importantly, you need to already have a good understanding of the solution to your actual problem to get a usable output.

Prompts that ChatGPT and to some extent Copilot do well on, like:
– I need a Next.js handler that takes a JSON structure X and stores it in the table Y in MySQL
– Write a Makefile to build a Docker image that runs mysqldump on X and uploads the result to Y

require a level of planning and understanding of software architecture that only comes from “strategic” thinking. These smaller tasks are in themselves extremely abstract, only peripherally related to the real-world problem of “selling plants online” or even “backing up our production database.” These tools are made to work at that level.

If I were to ask ChatGPT “please back up my database,” why would I expect its answer to be any better than one of the hundreds of competent SaaS offerings out there? For its answer to be good, I need to guide it so that its answer fits well into my concrete codebase, team structure, operations, and planning. This is hard work. It requires thinking, communication, prototyping, learning new technologies, and knowing the business, the codebase, project constraints, requirements, and team topologies.

That is exactly what I enjoy doing as a principal engineer: high-level technical strategy, with very sharp decision-making when it comes to concrete problems, while giving team members the opportunity to own and shape the result, or a more menial AI the pleasure of filling in the blanks.