WASM is cool; I've started implementing a CPU in Verilog that runs unmodified WASM, but I find the feature creep in the instruction set (SIMD, GC) takes away from the initial values behind WASM (simple, small).
You can ignore SIMD and GC (for now). SIMD explodes the complexity of Wasm, especially when there is WebGPU. I am curious how you are handling layout and all the irregular sizes.
Oh, I don't think so either. But if you think back to the asm.js days, there was a clear goal of "simple and higher perf"; now it's going in the direction of maximum compatibility with existing stacks (GC, WASI, etc.) at "any" cost.