ChatGPT and generating Assembly code

Fri, Nov 3, 2023

Why not use a tool if it’s available, right? I’ve been experimenting with ChatGPT for a while and it writes generally abysmal Assembly code for the C64. That’s probably because its training data on this specific language is tiny. Now I would really like to have a generator that does all of the boring stuff discussed in the previous post. Writing reams of repetitive code to parse out individual words is not my idea of fun, but it’s a necessary evil to get this game working. So I recruited the machine to do the machine’s job!

First, some background. Larry has a file called WORDS.TOK in its original release. This contains all of the inputs that the game responds to. I spent some time decoding this file and converting all the word groups and individual words into a YAML format. Here’s a snippet of that:

- wordgroup:
   - balcony
   - fire escape
   - rail
   - railing
- wordgroup:
   - lobby
- wordgroup:
   - abuse
   - play with
- wordgroup:
   - lobby
- wordgroup:
   - honeymoon suite
   - suite
- wordgroup:
   - penthouse
   - penthouse suite

What I want a generator script to do is the following:

Assign incremental IDs to each of the wordgroups.
Alphabetize all individual words, preserving the IDs from their groups.
Generate Assembly code from a template that tokenizes all words.

Now I’m a reasonable Rust coder, but this looks much more like a job for a language like Python. The problem? I’ve never written a single line of that in my entire life! Fortunately I know a machine that has, so let’s ask the chatbot!

Breaking up the problem into bits

My experience with ChatGPT is that it works fairly well on problems broken up into smaller steps. You shouldn’t ask it to take over the world for you, but ask it to perform each of the steps and it might just make you the next Emperor Zurg. I digress.

Initially I asked ChatGPT to take my YAML and reshuffle it so that every wordgroup has an ID of its own. It dutifully did so, although the code looks positively baroque to my untrained eye. Are all Python scripts like this? I hope not, but it’s fine anyway.
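To give an idea of what this first step involves, here is a minimal sketch. It is not the script ChatGPT produced. The function names `parse_wordgroups` and `assign_ids`, the starting ID of 1 and the output shape are all assumptions made for illustration. The tiny parser only understands the two-level layout of the snippet above; a real script would more likely lean on a proper YAML library.

```python
# Minimal sketch: give every word group an incremental ID.
# The parser below only handles the "- wordgroup:" layout shown earlier;
# it is NOT a general YAML parser.

SNIPPET = """\
- wordgroup:
   - balcony
   - fire escape
   - rail
   - railing
- wordgroup:
   - lobby
- wordgroup:
   - abuse
   - play with
"""


def parse_wordgroups(text):
    """Return a list of word lists, one per '- wordgroup:' entry."""
    groups = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped == "- wordgroup:":
            # A new group starts here.
            groups.append([])
        elif stripped.startswith("- ") and groups:
            # An indented "- word" line belongs to the most recent group.
            groups[-1].append(stripped[2:].strip())
    return groups


def assign_ids(groups, first_id=1):
    """Pair each group with an incremental ID (the start value is an assumption)."""
    return [
        {"id": group_id, "words": words}
        for group_id, words in enumerate(groups, start=first_id)
    ]


groups_with_ids = assign_ids(parse_wordgroups(SNIPPET))
for group in groups_with_ids:
    print(group["id"], group["words"])
```

Keeping the ID next to the word list means later steps can shuffle the words around freely without losing track of which group they came from.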

As long as the script gets the job done, I’m fine with it. Once the game is done, I’m not going to be using the script anymore and I won’t be releasing it to the world either. What I do want is to use it in my CI/CD pipeline, so that I can make quick rebuilds of Larry at a moment’s notice and release those to the world when they’re in any sort of playable shape. So: fine if it’s crap, as long as it works.

Keeping ChatGPT on track means taking the output of the first script and using that as input for the next one: alphabetizing the words while maintaining their wordgroup IDs. That also works just fine, after feeding ChatGPT back some of the mistakes it made.
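The alphabetizing step could look roughly like the sketch below. The input shape (a list of groups, each with an `id` and its `words`) is an assumption, since the real intermediate format isn't shown here, and `alphabetize` is a made-up name.

```python
# Minimal sketch: flatten the groups into (word, group_id) pairs and sort
# them by word, so every word keeps the ID of the group it came from.

# Hypothetical output of the ID-assigning step, hard-coded for the example.
groups_with_ids = [
    {"id": 1, "words": ["balcony", "fire escape", "rail", "railing"]},
    {"id": 2, "words": ["lobby"]},
    {"id": 3, "words": ["abuse", "play with"]},
]


def alphabetize(groups):
    """Return all words as (word, group_id) tuples in alphabetical order."""
    pairs = [
        (word, group["id"])
        for group in groups
        for word in group["words"]
    ]
    # Tuples sort on their first element first, so this orders by word.
    return sorted(pairs)


sorted_words = alphabetize(groups_with_ids)
for word, group_id in sorted_words:
    print(f"{word}: {group_id}")
```

Synonyms such as "rail" and "railing" end up next to each other here only because they happen to sort that way; it is the shared ID, not the position, that ties them together.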

Generating Assembly code

The actual Assembly output is something that I DO care very much about, and it is also quite crummy. I’m writing in KickAssembler, but I’ll forgive ChatGPT for not knowing that particular dialect. The generated code is also a bit wordy and tries to do a lot of thinking for me, while I already have a distinct design that separates the engine core from scene-specific code. I’ll work around all of that quite easily though, since Python is a lot easier to read and modify than it is to write from scratch if you know nothing about it.
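For illustration, template-driven generation can be as simple as the sketch below. Everything about the emitted Assembly is an assumption: the `word_table` label, the zero terminator after each word, the `$ff` end marker and the overall table layout are invented for the example and are not the game's actual tokenizer. It emits `.text` and `.byte` directives with `//` comments in the style KickAssembler uses, but the output has not been checked against the real project.

```python
from string import Template

# Hypothetical template for one table entry: the word as text, a zero
# terminator, then the ID of the word group it belongs to.
ENTRY = Template('    .text "$word"\n    .byte 0, $group_id')


def render_word_table(sorted_words, label="word_table"):
    """Render (word, group_id) pairs as an Assembly data table."""
    lines = [f"{label}:"]
    for word, group_id in sorted_words:
        lines.append(ENTRY.substitute(word=word, group_id=group_id))
    # Made-up end-of-table marker so a lookup routine knows where to stop.
    lines.append("    .byte $ff  // end of table (assumed convention)")
    return "\n".join(lines) + "\n"


# Hypothetical output of the alphabetizing step.
example_words = [("abuse", 3), ("balcony", 1), ("lobby", 2)]
asm = render_word_table(example_words)
print(asm)
```

Because the Assembly lives in a template string rather than being scattered through the logic, adapting the output to a different dialect or table layout mostly means editing `ENTRY`.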

So, in closing for today: I had a lot of help from ChatGPT, knowing full well that the output would be potentially shoddy. It saves me days of writing a generator by hand, so I can now get back to actually cleaning up the generated Assembly and integrating that into the game. This is one MASSIVE hurdle eliminated by machines doing what machines do best: boring grunt work. Gotta love computers!