
The Time I Tried Making My Own Programming Language

Date: 2023-09-14 23:34

Sometimes, you'll be writing code and decide you want a challenge. I'm the kind of person who will try something I can't do, fail, try some more, and somehow manage it. This strategy doesn't always work, as I proved with the W++ project.

Back in $MONTH, I decided to try my hand at making a programming language. I immediately threw any ideas about writing a compiler into the bin, because writing a compiler is insanely difficult.

Compilers Are Hard

Compilers are made by very smart people with a deep understanding of the language they're writing for, how computers work, and how to handle errors and edge cases. Most compilers also provide build-time optimisations that make the program run faster, which are even harder to implement, as they can introduce instability if done incorrectly. Compiler devs are universally smarter than amateurs like me. Most would say, "Cute. You're trying to write a programming language. You're just an idiot who hasn't experienced the real world. Get back to reality."

Design

First order of business for any language is to design it. I wrote a specification document, which was basically a source code file. You can download the specification here, if you so wish.

This bit was easy, and I ended up making something that looks like Python and C#. Cool.

How To Write a Programming Language

Generally, there are 3 steps to writing a programming language:

I know that there are more ways to do the Final Step, but that's not important.

Writing The Lexer

This was trivially easy. I managed to get an artificial intelligence to do it. Crazy what regular expressions can do.
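To give a rough idea of what a regex-based lexer looks like, here is a minimal Python sketch. It is not the actual W++ lexer: the token names, the TOKEN_SPEC table and the tokenize() function are all made up for illustration, and the real specification may well define different tokens.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set, assumed for this sketch only.
# Order matters: keywords must come before the generic IDENT pattern.
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"'),
    ("NUMBER", r"\d+"),
    ("TYPE", r"\b(?:str|int)\b"),
    ("DEF", r"\bdef\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("EQUALS", r"="),
    ("SEMI", r";"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]

# One big alternation of named groups; match.lastgroup tells us which one hit.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn source text into a list of Token(kind, text)."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue  # whitespace carries no meaning here
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append(Token(kind, match.group()))
    return tokens


for token in tokenize('str x = "3";'):
    print(token.kind, token.text)
```

The catch-all MISMATCH pattern at the end is there so that an unknown character raises an error instead of being silently skipped.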

The Parser

Writing a basic parser is easy. Say you want to define a string:

str x = "3";

There are 4 elements:

What I did was write a Parser script that was fed a series of tokens, checked each one, and moved on to the next. That meant that when it saw a str token, it could call a parseString() method to parse the string definition. All you have to do is expect a specific series of tokens and throw an error if you don't get what you expected. A simple expect() method does the trick. Jump to the token you are looking for, collect all the info you need, and create some kind of string definition, be it a dictionary or a class. Pretty simple stuff.
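The expect() idea can be sketched in a few lines of Python. This is a reconstruction for illustration, not the original W++ code: the token kinds ("TYPE", "IDENT" and so on), the peek() helper and the shape of the returned dictionary are assumptions, and parse_string() is just a Python-style spelling of the parseString() method mentioned above.

```python
class Parser:
    def __init__(self, tokens):
        # tokens is a list of (kind, text) pairs, e.g. ("IDENT", "x")
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the current token without consuming it."""
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", "")

    def expect(self, kind):
        """Consume the current token if it has the given kind, else fail."""
        found_kind, text = self.peek()
        if found_kind != kind:
            raise SyntaxError(f"expected {kind}, got {found_kind} {text!r}")
        self.pos += 1
        return text

    def parse_string(self):
        """Parse a definition of the form: str <name> = "<value>";"""
        type_name = self.expect("TYPE")
        if type_name != "str":
            raise SyntaxError(f"expected 'str', got {type_name!r}")
        name = self.expect("IDENT")
        self.expect("EQUALS")
        literal = self.expect("STRING")
        self.expect("SEMI")
        # Strip the surrounding quotes from the literal.
        return {"node": "string_def", "name": name, "value": literal[1:-1]}


tokens = [
    ("TYPE", "str"),
    ("IDENT", "x"),
    ("EQUALS", "="),
    ("STRING", '"3"'),
    ("SEMI", ";"),
]
print(Parser(tokens).parse_string())
```

Every other kind of statement follows the same pattern: a method that calls expect() for each token it needs, in order, and builds a result from what it collected.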

Initially, writing the parser was quite simple and I was getting the hang of it. I saw how flexible the system I had set up was, which took a LOT of trial and error to do, and was about to start writing code for parsing functions when I suddenly hit an unexpected but pretty massive snag.

The Issue

Examine this code:

def function() {
    int x = 4;
    str y = "hello";
}

From the perspective of a lexer/parser combo, how do you tell what code is inside the function and what isn't? It's more difficult than you think.

Traditionally, some programming languages defined an "end" line or similar that would tell the parser that the function has ended. However, I didn't want to add this, as it was a pretty janky solution that makes for needlessly complex and ugly code.

As such, the only way to tell if a function has ended is with closing curly braces, which are super ambiguous.
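For reference, one common way to work out which closing brace ends a function is to keep a depth counter: add one for every opening brace, subtract one for every closing brace, and stop when the counter returns to zero. The sketch below is not something W++ ever did. It works on a plain list of token strings, split_function_body() is a hypothetical helper, and the nested "if x { ... }" in the example is invented syntax used only to show nesting.

```python
def split_function_body(tokens, open_index):
    """Return (body_tokens, index_after_close) for the block opened at open_index."""
    if tokens[open_index] != "{":
        raise SyntaxError("expected '{'")
    depth = 0
    for i in range(open_index, len(tokens)):
        if tokens[i] == "{":
            depth += 1
        elif tokens[i] == "}":
            depth -= 1
            if depth == 0:
                # This brace matches the one at open_index.
                return tokens[open_index + 1:i], i + 1
    raise SyntaxError("unclosed '{'")


tokens = [
    "def", "function", "(", ")", "{",
    "int", "x", "=", "4", ";",
    "if", "x", "{", "str", "y", "=", '"hello"', ";", "}",
    "}",
    "int", "z", "=", "1", ";",
]
body, after = split_function_body(tokens, tokens.index("{"))
print(body)            # everything between the function's own braces
print(tokens[after:])  # whatever follows the function
```

A recursive parser gets the same result without counting, because each nested block consumes its own closing brace before returning to the block that contains it.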

While I was trying to fix this issue, which I never ended up fixing, I ran into an even bigger issue.

The Bigger Issue

As I tried to fix my function parsing, I began to think. I quickly realised that the language was a lot more complex than I thought. For example, it had to handle function definitions with and without parameters, statements with either explicit logic or variable references, proper error handling, and static typing, which is awkward when the implementation is written in Python, a dynamically typed language. It also had to evaluate variable names, logic, bitwise operations, lists and dictionaries. I also wanted to implement classes, which would be a nightmare to sort out, as they had constructors, methods, attributes, subclasses, instance checking, and a lot more. Suddenly, I realised I had bitten off more than I could chew.

And that's why W++ failed. I tried to do too much.

Conclusion

I learnt a lot when working on W++. It taught me a valuable lesson: Be appreciative of existing programming languages and don't try to enter the market for no reason. It's disrespectful to the people who dedicate their time to making existing languages work. There's nothing wrong with them.

Unless it's JavaScript.

Fuck you, Netscape. You did this to us. It's all your fault.