Three Undocumented Projects And The Three Month Hiatus


>> The WebAssembly Backend Rewrite + Year 11 Physics, Applied + me.l-m.dev

Posted on | 1780 words | ~9 minute read


You know how long ago this was?

Three months too long.

A lot has happened that’s gone completely undocumented, and I am here to clear that up.

The V WebAssembly Compiler Backend

Feb 26, 2023 | 1615 words | ~8 minute read
[ V ] [ Compiler Theory ] [ WebAssembly ]

That post was created at the start of the school year. I am now halfway through the year and on break.

During that time I have been working on three large projects.

Excuse for not documenting them on the blog? Pure laziness.

Not Exactly

The real reason was a combination of school and developer productivity. I don’t have much time to work on projects, so I ship them as fast as I can in the time I do have.

I work pretty fast, and use up the time I have, so the blog then gets ignored.

Decent excuse? Forgive me? You’re in luck, I have a lot in store for you.

The V WebAssembly Backend Rewrite

This I am proud of.

I’ll break it down for you.

  1. The WebAssembly backend depends on Binaryen. Binaryen is an absolute nightmare to package and distribute to users of V. The issue was the C++ dependency: not everyone has the latest version of libstdc++, especially people using LTS Linux distributions and Windows.

  2. Code Generation was a mess. Not all of it, I’ll give myself credit there, just the code dealing with memory locations and stack frames. It was the result of wishful premature optimisation, and I have found and implemented a better solution.

  3. The ability to generate WebAssembly code should not be limited to the compiler only. Expose the ability to generate WebAssembly code to the standard library! I was a little annoyed that the native backend didn’t do this, I would have loved to build in-memory JIT compilers. import wasm is a MUST.

Solution?

  1. Read through the entire WebAssembly binary specification, implement a nice API to generate WebAssembly modules from V.

  2. Rip out Binaryen and rewrite Code Generation. I saw myself rewriting a very large chunk of it, cleaning up the code along the way.

  3. Finish?

I’ll get to number 3 at the end.

This is import wasm, it is in the V standard library.

It’s a full implementation of the WebAssembly binary format MVP.

The library is designed as a pure V alternative to Binaryen.

$ wasm-validate num.wasm; echo $?
0

It’s fairly complex, whilst its API is simple.

import wasm
import os

mut m := wasm.Module{}
mut f := m.new_function('num', [], [.i32_t])
{
	f.i32_const(10) // | i32.const 10
	f.i32_const(15) // | i32.const 15
	f.add(.i32_t)   // | i32.add
}
m.commit(f, true) // export: true
os.write_file_array('num.wasm', m.compile())!
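To give a feel for what a call like m.compile() has to produce, here is a rough sketch that hand-assembles an equivalent module for num, using the section layout and LEB128 integers from the WebAssembly binary format. It is written in Python rather than V so it leans on nothing from the V standard library. The helper names (uleb128, sleb128, vec, section, build_num_module) are made up for illustration, and the bytes are not guaranteed to match the library's output byte for byte.

```python
# Hand-rolled encoder for one tiny WebAssembly module (illustrative sketch only).

I32 = 0x7F  # value type code for i32 in the binary format


def uleb128(n: int) -> bytes:
    """Unsigned LEB128: 7 bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def sleb128(n: int) -> bytes:
    """Signed LEB128, used for the immediates of i32.const."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7  # Python's >> is an arithmetic shift, so negatives end at -1
        done = (n == 0 and not byte & 0x40) or (n == -1 and bool(byte & 0x40))
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def vec(items: list) -> bytes:
    """A vector is an element count followed by the encoded elements."""
    return uleb128(len(items)) + b"".join(items)


def section(section_id: int, payload: bytes) -> bytes:
    """A section is an id byte, the payload size, then the payload."""
    return bytes([section_id]) + uleb128(len(payload)) + payload


def build_num_module() -> bytes:
    # Function type: () -> i32
    func_type = b"\x60" + vec([]) + vec([bytes([I32])])
    # Body: no extra locals, then i32.const 10, i32.const 15, i32.add, end
    body = vec([]) + b"\x41" + sleb128(10) + b"\x41" + sleb128(15) + b"\x6a\x0b"
    code_entry = uleb128(len(body)) + body
    # Export "num" as a function (kind 0x00) with function index 0
    export = uleb128(3) + b"num" + b"\x00" + uleb128(0)
    return (
        b"\x00asm" + b"\x01\x00\x00\x00"   # magic + version 1
        + section(1, vec([func_type]))      # type section
        + section(3, vec([uleb128(0)]))     # function section: func 0 uses type 0
        + section(7, vec([export]))         # export section
        + section(10, vec([code_entry]))    # code section
    )


if __name__ == "__main__":
    print(build_num_module().hex())
```

Writing the result of build_num_module() to a file should give a module that a validator such as wasm-validate accepts. The point of a library like import wasm is that you never have to track section sizes and integer encodings like this by hand.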

Using the interface, you can easily write compilers that generate WebAssembly on demand.

This example generates a binary importing functions from the WASI namespace, then calling them.

$ cd v
$ v run examples/wasm_codegen/hello_wasi.v \
	| wasmer run /dev/stdin
Hello, WASI!

I won’t get too far into this, you can read the docs yourself.

Another reason why I won’t go further will be explained shortly.

The Backend Rewrite.

When I was initially writing the WebAssembly backend, it was my introduction to code generation for a compiler as a WHOLE. I personally did not have as much experience back then as I do now.

To understand the structure of a typical compiler backend, I combed through the source code of V’s native ARM and x86_64 backends. This gave me a lot of pointers, one being the inspiration for the representation of a memory location.

// variables, very VERY similar to the native backend
type Var = Global | Stack | Temporary | ast.Ident

// representing a memory location, pointer or variable
// this is the primary interface
type LocalOrPointer = Var | binaryen.Expression

A common pattern inside the native backend would be to take an ast.Ident from an expression, and convert it to one variant of a Global | Stack | ... sumtype.

WebAssembly has a notion of locals, think of them as an infinite amount of CPU registers you can use freely.

Unlike CPU registers, WebAssembly locals aren’t volatile at all. They don’t get overwritten. I was not taking full advantage of this with that design.

How well this works for native is debatable, but for WebAssembly we can do better and simpler.

struct Var {
	name       string   // if applicable, used in debuginfo
mut:
	typ        ast.Type
	is_address bool     // complex flag, is_not_value_type essentially
	is_global  bool
	idx        int      // wasm.LocalIndex | wasm.GlobalIndex
	offset     int      // pointer offset
}

Two major things here. The fields is_address and offset.

The is_address field is a special flag.

Var{ is_address: !g.is_pure_type(typ) }

In short, value types that can be stored in a register are pure types, with some exceptions. It all comes down to the custom ABI, and how values are passed from function to function.

A struct obviously cannot fit into a WebAssembly local, so you must pass a pointer into the callee’s stack frame.

The is_address flag ensures you know how to store the value if it were to be cloned, passed around, dereferenced, and so on.

// is_pure_type(voidptr) == true
// is_pure_type(&Struct) == false
fn (g Gen) is_pure_type(typ ast.Type) bool {
	if typ.is_pure_int()
		|| typ.is_pure_float()
		|| typ == ast.char_type_idx
		|| typ.is_real_pointer()
		|| typ.is_bool() {
		return true
	}
	ts := g.table.sym(typ)
	if ts.info is ast.Alias {
		ptyp := ts.info.parent_type
		return g.is_pure_type(ptyp)
	}
	return false
}

It’s much better than the Global | Stack | Temporary | binaryen.Expression distinction, that was entirely a wishful premature optimisation.


The field offset is very nice to have. Why create a new local which is just the offset of another one?

The ability to reuse existing locals is very important and avoids waste.

This fits in very well with the stack offsets from the base pointer __vbp.

struct AA { a int b int c int d int }

fn test() {
	a := AA{}
	b := AA{}
}

One WebAssembly local, three memory locations.

Var{ name: '__vbp', idx: 0             }
Var{ name: 'a<AA>', idx: 0, offset: 0  }
Var{ name: 'b<AA>', idx: 0, offset: 16 }
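The offsets above fall out of a simple bump allocation over the stack frame. The following is a minimal sketch of that idea, written in Python to stay independent of the compiler's internals. It assumes each variable is placed at the next suitably aligned offset from the base pointer; the Frame class, its alloc method and the size and alignment arguments are hypothetical names for illustration, not the backend's real interface.

```python
from dataclasses import dataclass


@dataclass
class Var:
    name: str
    idx: int         # index of the WebAssembly local holding the base pointer
    offset: int = 0  # byte offset from that local's value


class Frame:
    """Hypothetical stack frame: one local (__vbp) shared by every stack variable."""

    def __init__(self, base_local: int = 0):
        self.base = Var("__vbp", base_local)
        self.size = 0  # bytes reserved so far

    def alloc(self, name: str, size: int, align: int) -> Var:
        # Round the running size up to the required alignment, then reserve.
        self.size = (self.size + align - 1) // align * align
        var = Var(name, self.base.idx, self.size)
        self.size += size
        return var


if __name__ == "__main__":
    frame = Frame()
    # AA holds four 32-bit ints: 16 bytes, 4-byte alignment.
    a = frame.alloc("a<AA>", size=16, align=4)
    b = frame.alloc("b<AA>", size=16, align=4)
    for var in (frame.base, a, b):
        print(var)
```

Both a and b reuse local 0 and differ only in offset, so no new local has to be created per stack variable.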

Where To From Here?

Read the open message and more here. (What website is this?? Keep reading.)

Want more exposition? Join the V discord and visit #wasm-backend.

The current status of the rewrite is as such. It generates invalid code for expression block statements such as If expressions, and certain parts of its design must be changed.

For a month or so back then, I was stringing the V community along with updates. There have been fewer and fewer updates due to school and such, so I’ve halted progress on the rewrite until now.

It’s time to show something new.

Year 11 Physics, Applied.

A collection of interactive demonstrations and physics simulations.

I love Physics, there isn’t anything like it. My obsession is probably evident from all of the simulations I’ve created over the years.

Remember that softbody one I did?

That’s one of them. Not even including the tons of toy Rasterisers, Ray Tracers, Ray Marchers, and Physics engines.

Welp, I take Physics in school. Why not do something nice to demonstrate what I learn in the curriculum?

I write every single Physics demo in C, then use Emscripten to compile it straight to WebAssembly.

For graphics I use Sokol and ImGui, they have bindings for C and can be compiled to WebAssembly using WebGL.

Calling make compiles the entire site into a self-contained collection of HTML and WebAssembly files.

Full source -> l1mey112/physics-applied

Want to read small notes about my thought process? (Again, same website. Keep reading.)

me.l-m.dev

This was a funny one.

I built an entire full stack, server side rendered, linear blogging application entirely in V over the long weekend, whilst I was still supposed to be on holiday. I was committing code whilst half asleep on the couch in some beach accommodation.

I’ve always wanted something like this.

The Story

For the past couple years, I’ve had a Discord channel in a shitty server where I would post things daily. Anything that interested me at the time, or what I was working on. The first post there was a 3D model supposed to be used in a portfolio website using ThreeJS and JavaScript. This was back when I was learning on my own to become a full stack JavaScript developer (yikes).

Posts after that? 3D models, animation, the early stas compiler, music, V, and so on.

I took the time to insert the posts into a sqlite3 database, and got on with my day. I also scraped music and personal anecdotes and shoved them in there too.

The site currently has 627 posts, and 235 unique tags.

It allows me to add new posts, edit existing posts, delete posts, and back everything up, all from the website itself.

Special Requirements

This is what I wanted, and achieved.

The website has to respect privacy, and to give users peace of mind, that means no JavaScript.

The website also has to be able to embed outside content, such as music from Spotify and videos from YouTube.

  1. Privacy Respecting
  2. Zero JavaScript
  3. Dynamic And Easy To Work With
  4. Embedding Outside Content
  5. RSS

The website can spot YouTube and Spotify URLs and replace them with the bare content, stripping JavaScript.

How does it spot and generate proper Spotify embeds without JavaScript?

Simple, web scraping.

  1. Import regex and use query https?://open\.spotify\.com/track/(\S+) to access the URL and Track ID.

  2. Make an HTTP GET request to the URL.

  3. On the HTML response, run this regex query on it:

    <script\s+id="initial-state"\s+type="text/plain">([^<]+)</script>

  4. Using the captured text, which is a Base64 encoded JSON string, decode it.

  5. Extract all the metadata you need, including the 30 second preview MP3, cover art, and so on.

  6. Embed safe HTML, without all the JavaScript on a normal <iframe> embed.
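The steps above can be sketched roughly as follows. This is not the site's V code: it is a small Python approximation that reuses the two regex patterns quoted in the list and runs against a made-up page instead of making a real HTTP request. The JSON keys (name, cover_url, preview_url), the fake_page helper and the output markup are assumptions for illustration only; the real payload layout is not described here.

```python
import base64
import html
import json
import re

# Patterns taken from the steps above.
TRACK_URL = re.compile(r"https?://open\.spotify\.com/track/(\S+)")
INITIAL_STATE = re.compile(
    r'<script\s+id="initial-state"\s+type="text/plain">([^<]+)</script>'
)


def find_tracks(text: str) -> list:
    """Step 1: return (url, track_id) pairs found in a post body."""
    return [(m.group(0), m.group(1)) for m in TRACK_URL.finditer(text)]


def decode_initial_state(page: str):
    """Steps 3 and 4: pull out the Base64 blob and decode it as JSON."""
    m = INITIAL_STATE.search(page)
    if m is None:
        return None
    return json.loads(base64.b64decode(m.group(1)))


def render_embed(meta: dict) -> str:
    """Steps 5 and 6: plain HTML only, every value escaped, no scripts."""
    title = html.escape(meta["name"])
    cover = html.escape(meta["cover_url"])
    preview = html.escape(meta["preview_url"])
    return (
        f'<figure><img src="{cover}" alt="{title}">'
        f"<figcaption>{title}</figcaption>"
        f'<audio controls src="{preview}"></audio></figure>'
    )


def fake_page(meta: dict) -> str:
    """Stand-in for the HTTP response in step 2 (hypothetical page layout)."""
    blob = base64.b64encode(json.dumps(meta).encode()).decode()
    return f'<html><script id="initial-state" type="text/plain">{blob}</script></html>'


if __name__ == "__main__":
    post = "Listening to https://open.spotify.com/track/abc123 today"
    print(find_tracks(post))
    meta = {
        "name": "Example Track",
        "cover_url": "https://example.com/cover.jpg",
        "preview_url": "https://example.com/preview.mp3",
    }
    print(render_embed(decode_initial_state(fake_page(meta))))
```

Escaping every scraped value before it goes into the markup matters here, since the whole point is that nothing from the outside page can smuggle a script back in.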

Simple, right? It’s cool, I like it.

The source code is completely open, and so is the website!

The End.

Three large projects, all documented here all at once.

These projects aren’t just one-offs, they’ll become long-running projects I’ll keep working on.

  1. The WebAssembly Backend Rewrite will be completed by the end of my break.

  2. I expect to add some more simulations, don’t want to miss out on optics content. I also expect to create some small blog posts around improving the site.

  3. me.l-m.dev is in dire need of some features right now, specifically pagination. I have a vision of what needs to get done, and how I will do it.

Until then, Goodbye!