Previewing Bud(Lang): an early-in-development embedded language written for Rust

ecton

I have reached a stage of my new language, Bud (or budlang), that I wanted to share my journey. For those who don't know who I am, my primary project is BonsaiDb. The BonsaiDb blog hasn't been updated in a few months. The tl;dr is that while I have made promising progress, when the beginning on September rolled around, I needed to take a break. This blog post is my way of bringing this break to a conclusion and share what I've been doing in lieu of working on BonsaiDb.

First, I should start with the main purpose of this post.

Previewing Bud(Lang)

Bud is an early-in-development scripting language written with the goal of offering a completely safe alternative to embedding Lua in your Rust executable. I had two desires of a scripting langauge, and Bud aims to fulfill both goals:

No unsafe code
Ability to budget a script to ensure it doesn't run forever

In looking at the existing scripting languages, I found nearly every language reached for unsafe, and none supported the ability to resume a script after its budget had been exhausted.

How bad is performance? It's actually pretty decent. It's too early in this project for me to focus on a wide variety of benchmarks, so my only benchmark is this naive Fibonacci function written in Bud:

function fibonacci(n)
    if n <= 2
        1
    else
        fibonacci(n - 1) + fibonacci(n - 2)
    end
end

I chose this because it has several basic math operations, some basic logic, and because of its naive recursion implementation, calling fibonacci(35) results in the function being called a whopping 18,454,929 times! This seemed like a good balance for looking at Bud's performance. Translating that naive implementation to various languages, this is how Bud stacks up executing fibonacci(35) on my machine:

Language	Time
Lua	0.49s
Javascript (quickjs)	0.69s
Bud	0.87s
Python	1.21s
GNU Common LISP	6.34s
Rhai (default features)	10.37s

I'm very happy with Bud's performance so far. Another achievement I'm happy with is that Bud has no dependencies. This means outside of Rust's standard library, Bud doesn't depend on or use any unsafe code.

For a project that was started less than a week ago, I'm proud of it's at. Before anyone else gets too excited, however, I should point out that Bud lacks a lot of features. For example, it has no strings, maps, or lists. The mechanisms for exposing Rust logic to Bud are at a proof-of-concept stage.

This post, after all, is more about the journey. Astute readers will notice a bit of a time gap: if I started working on this problem at the beginning of September and this language is less than one week old, where did the rest of the time go?

Well, let me tell you about my other language...

"You should write your own LISP"

Some of the smaller game concepts I have been pondering working on have still had elements of end-users being able to write scripts. I'm primarily interested in working on a multiplayer game that uses BonsaiDb as its database (for dogfooding reasons). For the server to execute that code, I would want it to be truly sandboxed. WASM could be a good option, but it comes with its own downsides:

Interface types aren't supported in the browser. If my plugins have both client-side and server-side components, the browser client will need to evaluate that code as well. While interface types are experimentally supported by various runtimes, they aren't an official standard and aren't supported in the browser. Without interface types, manual serialization must be done.

With an embedded language written in Rust, you can theoretically pass a Rust value into the language without serialization overhead without any special features.
Embedding a WASM runtime is quite significant. Executing a cargo build of an empty project with the wasmtime dependency compiles 184 dependencies and wasmer brings in 164.
You need to design a workflow for building scripts and then loading them. With a scripting language, the raw source can be loaded and executed, greatly simplifying the process.

One concept I liked from Wasmer is it's "gas metering", which allows WASM processes to be paused when they run out of gas, and resume once it they have more gas. This isn't really something I needed for my game concept, but I really like the concept.

For my purposes, Rhai's implementation of budgeting would suffice. Their approach allows custom logic to check how many operations have been performed and raise an error if the task needs to stop. This allows you to protect yourself against scripts that run for too long, but in the process, the executing script is aborted.

While looking to see if I could find any actively developed scripting language that had no unsafe code or ones that had this pause/resume feature, a friend I was coworking with jokingly suggested, "You should write your own LISP!" Little did they know that I had wanted to write one a long time ago, as the data-as-code/code-as-data concept is something I have always found poetic.

Bud take 1: A LISP-like

My initial implementation of Bud was a purely safe LISP-like. Here's how the fibonacci function looked in it:

(function fibonacci(num)
    (if (<= num 2)
        1
        (+ (fibonacci (- num 1)) (fibonacci (- num 2)))
    )
)

I was actually pretty happy with the project. I kept telling myself, "I'm sure it's not that fast, but it should be acceptable." But once I saw how quickly fibonacci slowed to a crawl, I started investigating how other languages performed. With my implementation being incredibly slower than every other language I tried, I went to the profiler.

One of the first optimizations I made was reducing the number of heap allocations needed for the Atom type. A core concept in LISP is S-Expressions. A simple implementation of the Atom and SExpression type could look something like this:

enum Atom {
    Integer(i64),
    SExpression(SExpression),
    ...
}

enum SExpression {
    Nil,
    Single(Box<Atom>),
    Pair(Box<Atom>, Box<Atom>),
}

The SExpression type needs to use Box<Atom> because otherwise Atom would recursively nest itself for the Atom::SExpression value. However, this means nearly every modification to a list requires a heap allocation.

To reduce the number of allocations, I created a new crate that created an object pool out of a shared buffer. Unlike the existing crates I could find, my implementation used a reference counter on the main pool to allow allocations to not require a lifetime. This approach uses a single heap allocation to be shared and reused, preventing a large amount of overhead that boxing values would normally have by skipping the OS allocator. This project only used one line of unsafe.

Before we get into the next boxing problem, we have to take a diversion to understand why async/await suddenly makes an appearance. While thinking about how to implement budgeting/metering, I realized I needed to allow Rust functions to save state and be resumed. For example, if a LISP function called a Rust function which needs to call a function (for example, get a value from a parameter by invoking a function), this could cause the Rust function to need to pause execution since it can't finish the evaluation.

I concluded that I was going to need to allow Rust tasks to store state. While thinking about how this might be structured, it dawned on me that Rust has a built-in mechanism for allowing pausing and resuming of functions: async/await. Under the hood, Rust generates a state machine that allows execution to be paused and resumed at each .await location.

I looked to see if any executor supported this gas/metering concept. So, I wrote one.

budget-executor: Manually metered async tasks

I have archived this project, as it's not part of Bud anymore. However, it's a fully functional, well-tested, runtime-agnostic implementation of a manual budgeting approach. When a future asks to spend budget and none is available, internally the future is paused. The code executing the future will return a type that contains that future and allows resuming the future once you've added more budget.

The project itself tries to minimize allocations to keep its overhead as low as possible. The future being executed isn't boxed, but the futures that are spawned are boxed -- a requirement I've come to realize. This is one of the hidden costs of futures that often gets overlooked.

Sadly, when I brought this into my LISP-like, I realized I still needed to box the futures for multiple reasons:

async_trait's methods work by boxing futures. If you look at the documentation or the generated code, you'll see the return type of the function is actually Pin<Box<dyn Future<Output = ...>>>. If I wanted my Rust code to use async_trait as a way to implement a type that can be invoked, each invocation turns into a heap allocation due to the box.
If I avoid async_trait by using something more akin to tower::Service, I still end up needing to box the type because of my goal of returning futures that could recursively call into the runtime. When the possibility of recursion in futures, Rust requires boxing the futures.

What this ends up meaning is that every function call in my LISP-like became a heap allocation. You can imagine my disappointment when I tested the speed of fibonacci(35) and gave up waiting on the 18 million heap allocations and frees to complete.

So, I invented a new type: PinStack. This crate hasn't been uploaded anywhere. It created a single contiguous allocation to use as in a stack-like fashion. This stack allowed pushing arbitrary values and receiving a PinStackAllocation<T> back, properly aligned for the type. Because the stack ensured only one allocation would ever point to a specific region of the underlying buffer, this was all sound. It was fun to design an efficient way that ensured each value's drop is called when the PinStack is dropped.

Back to my LISP-to-be, the next requirement for using this to box futures is the function signature of Future::poll():

pub trait Future {
    type Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

Instead of taking &mut self, it takes self: Pin<&mut Self>. Luckily, the type I created was designed for this: because the underlying allocation wouldn't be freed until the allocation was freed and couldn't move by design, I could safely create a Pin<&mut T> to the inner type.

With both of these new types and many hours spent with the profiler, I was able to finally beat GNU Common LISP by about half a second (coming in at 5.82s). But, I really wasn't pleased with the state of things.

Reflecting on my LISP

I was quite pleased with budget-executor, RefPool, and PinStack. These libraries were written specifically to optimize the needs of the LISP interpreter I wrote. Because they have no place in Bud today, I am not going to be releasing any of these. If anyone wished they had any of the projects mentioned, contact me and I'll happily ship over the code I have. All of these projects have some testing and the tests pass under Miri. I don't mind if these projects never see the light of day -- they were fun experiments aimed at solving interesting optimization problems.

What I wasn't happy about is that I had sacrificed on my goals. While I had kept unsafe out of the main project itself, I had introduced three new dependencies with unsafe in them. Is that really any better than the little bit of unsafe that, for example, Rhai has?

To be clear, my reason for avoiding unsafe code is that in general, I find it unecessary. There's a reason I choose to write Rust code: I love how it enables building safe abstractions atop unsafe code. As evidenced by the fun I had working on these projects, I'm am willing to reach for unsafe when it's the right tool for the job. With the example of the executor and creating an alternate to Pin<Box<dyn Future>>, there is literally no way to do these tasks without unsafe code.

Each line of unsafe code is a potential bug. I tried my hardest to ensure each instance of unsafe was sound, but despite that, I could have been wrong in some situation. This is why I try to have as little unsafe code in the projects I maintain, and why I tend to prefer dependencies with little or no unsafe code.

Despite being faster than Rhai in this one particular benchmark, I would have prefered to use Rhai than pursue my LISP-like. But, instead of bucking the recurring theme of never actually working on a game, I found myself pondering, "What if instead of a LISP-like, I did something more traditional like a stack-based language? Could I make it faster?"

All of the complexity I encountered in optimizating my LISP-like came from problems I could design around. What if I purposely designed a simple language aimed at meeting my goals exactly?

That leads us to Last Thursday morning. Before I had finished my first cup of coffee, I found myself in a new project with #![forbid(unsafe_code)] plastered at the top.

Bud take 2: A Safe Stack Machine

I had a goal: create a stack-based virtual machine capable of exactly executing this recursive fibonacci and supports pausing and resuming. By the time I stopped for lunch, I had stubbed out the basics of what became Bud's virtual machine, and it was executing fibonacci(35) in less than 0.5s.

As I added more features, it slowed down, but as shown in the current list of timings, it still has respectable performance. The biggest impact was adding the ability to wrap a Rust value so that it can be used within the virtual machine. This added a requirement for Drop on the Value type, which caused major changes in performance. Thankfully after looking at Lua's generated bytecode for the fibonacci function, I realized I could be smarter in my instruction design and was able to regain most of the lost performance.

Since then, I added a compiler for its simple language. The fibonacci snippet above is valid Bud code. The compiler is designed based off of my memory of studying the Dragon book in the early 2000s. Currently Bud has a lexer, a recursive descent parser, and an abstract syntax tree. It does not currently have an optimizer.

Despite not having an optimizer, the produced code is very clean. I'm particularly satisfied with is how the above function's compiled code compares against the hand-written VM instructions example:

#	hand-written	#	compiled
0	`lte @0 2 jump 2`	0	`lte @0 2 jump 3`
1	`return 1`	1	`return 1`
		2	`jump 8`
2	`sub @0 1 stack`	3	`sub @0 1 stack`
3	`recurse-call 1 $0`	4	`recurse-call 1 $0`
4	`sub @0 2 stack`	5	`sub @0 2 stack`
5	`recurse-call 1 $1`	6	`recurse-call 1 $1`
6	`add $0 $1 $$`	7	`add $0 $1 $$`

For those wanting to understand this representation of the VM instructions, it may be helpful to know that @ refers to an argument, $ is a variable, and $$ is the return "register".

What's next?

I have a vision of what I consider the true minimum viable implementation of Bud to be, and I still am missing several core features. Rather than risk things being designed differently than I envision with an outside contributor, I would rather finish my design and hear feedback than risk rejecting a contribution because it wasn't what I was hoping for.

If this project sounds interesting to you, please let me know. For all I know, everyone else who wanted a scripting language in their Rust project is happy with existing solutions. The more people who express interest in using or contributing to a minimal, dynamically-typed, completely safe language, the quicker I will finish up the remainder of my vision and open it up for feedback and contributions.

For now, however, I've been itching to get back to working on BonsaiDb. Because the optimization work has a fair amount of work remaining, I am wanting to get a new release of BonsaiDb out. Not only have I written many features that have gone unreleased for months, I was lucky enough to have several contributions from the community as well.

I'm aiming to strike a balance between working on the optimization project and moving BonsaiDb forward.

What about Bud?

I see Bud as being related to BonsaiDb in two ways:

Building a REPL-like command-line environment to work with BonsaiDb.
Using it as a proof-of-concept implementation of "dynamic schemas" -- schemas that aren't defined at compile time, but instead are powered by logic that is stored in the database itself. The core implementation in BonsaiDb will be done through traits to allow not only Bud-written ones, but eventually also WASM and other languages.

Because it's only tangentially related to BonsaiDb and my main focus is BonsaiDb, Bud's development will continue when I need a palate cleanser between my BonsaiDb work.

If you're interested in seeing where Bud goes, I would love to hear from you, or if it's easier, just leave a star on the GitHub project. Most of my development-related writing happens on the BonsaiDb blog. Because Bud project is only tangentially related to BonsaiDb, this under-used blog is where I will end up writing about it in the future.

As always, thank you for reading about my work!