The Fox and the Bolt: Bringing WebContainers to Firefox

Two days ago we announced that WebContainers are now compatible with Firefox. In fact, we soft-launched the alpha version a few weeks ago! We’ve been working hard for the past months to make this a reality, and we can honestly say that it has been both trying and rewarding. Seeing a full Node runtime run on a new browser was well worth the effort, but it has sent us down some pretty deep rabbit holes.

You see, WebContainers are not your run-of-the-mill web application. We don’t render any UI, nor do we have any CSS code. We don’t run into the “usual” cross-browser problems with rendering differences. Instead, we make heavy use of message passing, WebAssembly and atomics. We need to pay attention to the precise wording of Web specs and Node’s documentation, so we get the behavior right.

One big roadblock for Firefox support was cross-origin isolation, and we’ve already talked at length about the challenges that entails. Here we’re going to discuss everything else. Everything you need to know if you ever want to port a JS runtime to run on top of a different one.

The elephant in the room

We’ve had an open ticket in our internal tracker for a long time with a pretty dry description:

Error.prepareStackTrace

To provide more context, the V8 JavaScript runtime has a custom stack traces API that gives you some pretty powerful introspection capabilities. For instance:

It allows you to generate a stack trace omitting specific callers.
It allows you to inspect the stack as structured data.
It allows you to use said structured data to format the stack trace in any fashion.
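As a minimal sketch of those three capabilities, the snippet below runs in Node or any other V8-based runtime. `Error.captureStackTrace`, `Error.prepareStackTrace` and the call-site methods belong to V8's API; the function names (`assertPositive`, `getCallSites`, `formatCallSites`, `inner`, `outer`) are made up for illustration.

```javascript
// V8-only: works in Node and Chromium-based browsers, not in Firefox.

// 1. Omit specific callers: `assertPositive` and everything above it
//    are left out of the captured trace.
function assertPositive(n) {
  if (n <= 0) {
    const err = new Error(`expected a positive number, got ${n}`);
    Error.captureStackTrace(err, assertPositive);
    throw err;
  }
}

// 2. Inspect the stack as structured data: while the hook is installed,
//    reading `stack` returns whatever the hook returns.
function getCallSites() {
  const original = Error.prepareStackTrace;
  try {
    Error.prepareStackTrace = (_error, callSites) => callSites;
    const holder = {};
    Error.captureStackTrace(holder, getCallSites);
    return holder.stack; // an array of CallSite objects
  } finally {
    Error.prepareStackTrace = original;
  }
}

// 3. Format the structured data in any fashion we like.
function formatCallSites(callSites) {
  return callSites
    .map((site) => {
      const name = site.getFunctionName() ?? '<anonymous>';
      return `${name} (${site.getFileName()}:${site.getLineNumber()})`;
    })
    .join('\n');
}

function inner() { return getCallSites(); }
function outer() { return inner(); }

console.log(formatCallSites(outer()));
```

None of this exists in SpiderMonkey, so code written this way fails outright on Firefox unless something fills the gap.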

The stack traces API is naturally part of Node’s API, at least implicitly, and it’s used widely both in the Node ecosystem and by Node itself in its internal code. When WebContainers were initially developed, we were fully aware that we were acquiring some “technical debt” of sorts: by focusing first on Chrome, also based on V8, we got support for this API for free. On the other hand, the aforementioned ticket 🙈

The solution for this is neither easy nor complete for a variety of reasons. For starters, the stack field in the Error instances is highly vendor-dependent (it’s not even part of the JS spec!). We’ve developed a series of polyfills for these APIs that partially fill the gap, making use of whatever information we can extract from Firefox’s stack traces.
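To give a flavor of what “extracting information from Firefox’s stack traces” can look like, here is a rough sketch. It is not our actual polyfill: `parseFirefoxStack` is a hypothetical helper, and it assumes the common SpiderMonkey layout of one `functionName@fileURL:line:column` frame per line.

```javascript
// Hypothetical helper: turn a SpiderMonkey-style `error.stack` string
// into plain frame objects, skipping anything it does not recognize.
function parseFirefoxStack(stack) {
  const frames = [];
  for (const line of stack.split('\n')) {
    if (line === '') continue; // the string usually ends with a newline
    // Lazy name, greedy file name: URLs containing ':' stay intact.
    const match = /^(.*?)@(.*):(\d+):(\d+)$/.exec(line);
    if (!match) continue;
    const [, name, fileName, lineNumber, columnNumber] = match;
    frames.push({
      functionName: name === '' ? null : name, // anonymous frames have no name
      fileName,
      lineNumber: Number(lineNumber),
      columnNumber: Number(columnNumber),
    });
  }
  return frames;
}

// Sample input written by hand in the assumed format.
const sampleStack =
  'inner@https://example.com/app.js:10:15\n' +
  'outer@https://example.com/app.js:20:3\n' +
  '@https://example.com/app.js:30:1\n';

console.log(parseFirefoxStack(sampleStack));
```

Frames like these can back objects that mimic V8’s call sites, but only partially: details that the string never contained cannot be recovered.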

On this note: we understand that there might be good reasons why Error introspection is not part of the standard. It exposes lots of details about the engine, including possible optimizations. However, any progress in this direction would be very beneficial for advanced use cases like ours, so we’ll pay close attention to whatever comes up in TC39.

Sometimes it’s just a bug

One good thing about the complexity of WebContainers is that they end up being quite the stress test for browser engines. In the course of porting them to Firefox, we’ve uncovered a variety of bugs that, apparently, nobody had noticed before!^[1]

All of them are related to MessageChannel and/or Atomics. Luckily for us, they were easy to work around, so their impact is actually quite minimal. However, they made for a few fun hours of debugging weird race conditions, all to find out that it was not “our fault”.

In any case, we dutifully reported them and got great responses from the Firefox team. The details of each bug are sometimes very involved (as our setting is!) but here they are, for your own amusement:

Don’t blame the player!

To be fair, we uncovered a few mistakes of ours when porting to Firefox as well! At the end of those debugging sessions, sometimes there was a particular race condition that, for whatever reason, never manifested on Chrome but was there anyway. So porting to Firefox helped us fix a few bugs. That’s what multiple independent implementations are for! 🎉

In addition to our own mistakes, we sometimes encountered problems with other implementations when compared to Firefox. For instance, we found a bug in Node itself, where a MessageChannel does send a message through a closed port (#42296). This is a tricky one because the semantics of MessageChannel and worker_threads in Node don’t necessarily align with Web Workers,^[2] but in this case it does seem it was an unintentional deviation. Kudos for the quick fix!

Another example is the User-Agent header: Firefox does allow you to set this header in fetch() calls, an addition to the spec that is 7 years old (!), while Chrome does not yet allow for that. This matters to us because custom user-agents are more likely to trigger CORS errors.

Sometimes it’s nobody’s fault

There have been a handful of (sad) cases where we couldn’t fix the underlying issue or we just wouldn’t. For instance, we’ve encountered (at least twice!) code in the wild that would sort an array doing something akin to this:

things.sort((aThing, bThing) => {
  return someCustomLogic(aThing, bThing) ? -1 : 0;
});

This is wrong! The comparison function we pass to sort has to be consistent, meaning that if we call it with (a, b) and get -1, calling it with (b, a) should return +1, which is not what will happen here. This code works in the wild because V8 happens to compare elements in a certain order that makes it work, but this is highly dependent on its internal implementation. Even if you don’t care about cross-engine compatibility, a future optimization might break your code!^[3]
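For comparison, here is one way to build a consistent comparator out of a boolean “comes before” check. The names (`comesBefore`, `compareThings`, the `priority` field) are hypothetical stand-ins, and the sketch assumes the check itself describes a sensible ordering (it never says both “a before b” and “b before a”).

```javascript
// Hypothetical stand-in for someCustomLogic: true when `a` must come first.
function comesBefore(a, b) {
  return a.priority < b.priority;
}

// Ask the question in both directions so that swapping the arguments
// flips the sign of the result, as sort() requires.
function compareThings(a, b) {
  if (comesBefore(a, b)) return -1;
  if (comesBefore(b, a)) return 1;
  return 0; // neither comes first: treat them as equal
}

const unsorted = [
  { name: 'c', priority: 3 },
  { name: 'a', priority: 1 },
  { name: 'b', priority: 2 },
];

const sorted = [...unsorted].sort(compareThings);
console.log(sorted.map((thing) => thing.name).join(',')); // a,b,c
```

With a comparator like this, the result no longer depends on the order in which an engine happens to visit the elements.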

A similar example is the use in the wild of non-standard date-time strings:

new Date('2022-04-13Z')

The Date constructor must correctly recognize (simplified) ISO 8601 date strings (YYYY-MM-DDTHH:mm:ss.sssZ). But each implementation might choose to parse additional date-time string formats. V8 correctly parses the above example, while SpiderMonkey does not. In this case we did choose to polyfill this behavior.
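If you control the calling code, the portable fix is to hand the constructor a string in the standard format. The sketch below is not our polyfill: `normalizeDateString` is a hypothetical helper, and it assumes that a date followed by a bare `Z` is meant as midnight UTC on that day.

```javascript
// Hypothetical helper: rewrite "YYYY-MM-DDZ" into the full ISO 8601 form
// that every engine is required to parse. Other inputs pass through as-is.
function normalizeDateString(input) {
  const match = /^(\d{4}-\d{2}-\d{2})Z$/.exec(input);
  // Assumption: the trailing "Z" stands for "midnight UTC".
  return match ? `${match[1]}T00:00:00.000Z` : input;
}

const portableDate = new Date(normalizeDateString('2022-04-13Z'));
console.log(portableDate.toISOString()); // 2022-04-13T00:00:00.000Z
```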

Another thorny issue we are still dealing with is task scheduling. When a browser API defers something to be called “later” (say, messages, promises, timers, etc.) it has some freedom on how to schedule things. We do know that microtasks (e.g. promise callbacks) will get called before tasks (e.g. most of the other stuff that we mentioned before). But when a task is queued, the browser is free to schedule it according to its nature in any way it wants. Quoting from the spec:

Let taskQueue be one of the event loop’s task queues, chosen in an implementation-defined manner […]

This means, for instance, that this code works differently between browsers (the order of the messages is different):

const channel = new MessageChannel();

channel.port1.onmessage = () => console.log('message');

setTimeout(() => console.log('timeout'));

channel.port2.postMessage(null);

which is totally fine, since the “ports event queue” and the timers queue are different. Our event loop implementation is not yet able to work around this issue and some observable consequences are expected. It is in alpha, after all!
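By contrast, the microtask-before-task ordering mentioned earlier is something you can rely on in every engine. A small sketch (the `order` and `finished` names are ours, purely for illustration):

```javascript
const order = [];

const finished = new Promise((resolve) => {
  // A task: runs only after the current script and all microtasks.
  setTimeout(() => {
    order.push('task: timeout');
    resolve(order);
  }, 0);

  // Microtasks: run right after the current script, in the order queued.
  Promise.resolve().then(() => order.push('microtask: promise'));
  queueMicrotask(() => order.push('microtask: queueMicrotask'));

  order.push('sync');
});

finished.then((result) => console.log(result.join(' -> ')));
// sync -> microtask: promise -> microtask: queueMicrotask -> task: timeout
```

Only the relative order of tasks coming from different task queues, like the timer and the message port above, is left to the implementation.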

Testing, blood, and tears

There might be a certain irony in the fact that the most painful part of this project was not untangling these compatibility issues, but writing tests for them. Because of their delicate nature, WebContainers are mostly tested in an end-to-end fashion, by spawning a real browser and observing its output. We cannot afford mocking the underlying platform: it’s a full-fledged browser!

Anybody who deals with browser end-to-end testing knows what we’re about to say next: it’s 2022 and there’s no standardized and dev-friendly way to write cross-browser tests. WebDriver is kind of outdated/old-school. The Chrome DevTools Protocol (the technology that supports Puppeteer) allows for more flexibility but, as its name gives away, it is Chrome-only. Actually, we currently use Puppeteer, which works on Firefox thanks to the herculean efforts of the Mozilla team. But Firefox’s support for this protocol is incomplete, and its interaction with cross-origin isolation makes it a tad fragile.

We’ve heard the chatter about WebDriver BiDi, the next-generation standard that is supposed to supersede the current status quo. We eagerly look forward to any progress in this area 🤞🤞

We need your feedback

We’re very excited to finally unveil this effort, and we can’t wait for you all to try it out. As we’ve detailed at length, we expect it to have a few rough edges. You can check our documentation about browser support, and we’ll appreciate any bug reports sent our way.

This work has significantly improved our understanding of what we can achieve and what we need for better cross-browser compatibility. We are now in a better position to start subsequent efforts, such as supporting mobile browsers.^[4] We also have a better grasp on some of the fundamental pieces that WebContainers need to function, so we’ll keep working to bring this technology everywhere we can.

[1] To be fair, some of those bugs were already reported. But Gecko is a pretty large piece of software, and we weren’t able to relate those existing reports with the symptoms we were observing.
[2] Note that the worker_threads implementation of Node does not share any code whatsoever with Chrome’s Worker implementation, nor with that of any other browser for that matter.
[3] As far as we know, V8 does not provide any extra guarantees about its sorting implementation beyond what is in the ECMAScript spec.
[4] Though, one final surprise: Firefox mobile works out of the box!
