Three JavaScript performance fundamentals that make Bluebird fast

As promised in our recent Promises made by a Reaktor developer had an impact on the industry article, here’s some original knowledge from our very own Petka Antonov – programmer and creator of the acclaimed Bluebird promise library.

Bluebird is a widely used promise library for JavaScript which initially got noticed back in 2013 due to it being up to 100 times faster than other promise implementations with similar feature sets at the time. What makes Bluebird so fast is the consistent application of some JavaScript optimization fundamentals throughout the library. This article will go over in detail 3 of the most valuable fundamentals that were used to optimize Bluebird.

1. Minimize function object allocation

Object allocations, particularly function object allocations due to the heavy amount of internal data needed to implement them, can be very taxing to performance. Practical implementations of JavaScript are garbage collected so allocated objects are not simply just sitting in memory but the garbage collector is constantly looking for unused objects so that they can be deallocated. The more memory you use in JavaScript the more CPU is being used to power the garbage collector and less CPU becomes available to run actual code.

In JavaScript functions are first class objects. This means that they have same features and properties as any other object. If you have a function which contains code declaring another function or functions, then every call to the parent function will create new unique function objects despite having the same code. A basic example of this would be:

function trim(string) {
    function trimStart(string) {
        return string.replace(/^\s+/g, "");
    }

    function trimEnd(string) {
        return string.replace(/\s+$/g, "");
    }

    return trimEnd(trimStart(string))
}

Now every time the trim function is called, two unnecessary function objects are created to represent the functions trimStart and trimEnd. The function objects are unnecessary because they are not being used for their unique object identities such as property assignment or closed over variables. They are only used for the functionality of the code that they contain.

This particular example is very easy to optimize, the functions can simply be moved outside of trim. As the example is contained in a module and the module is only loaded once for the program, only one representation will exist for the functions:

function trimStart(string) {
    return string.replace(/^\s+/g, "");
}

function trimEnd(string) {
    return string.replace(/\s+$/g, "");
}

function trim(string) {
    return trimEnd(trimStart(string))
}

However, more commonly function objects seem like a necessary evil and cannot be optimized out so trivially. For instance, anytime you are passing a callback function somewhere to be called later, there is virtually always a need for a unique context for that particular callback. Typically such context is implemented in an easy and intuitive but inefficient way using closures. A simple example of this can be reading a file as JSON in node using the default asynchronous callback interface:

var fs = require('fs');

function readFileAsJson(fileName, callback) {
    fs.readFile(fileName, 'utf8', function(error, result) {
        // This is a new function object created every time readFileAsJson is called
        // Since it's a closure, an internal Context object is also 
        // allocated for the closure state
        if (error) {
            return callback(error);
        }
        // The try-catch block is needed to handle a possible syntax error from invalid JSON
        try {
            var json = JSON.parse(result);
            callback(null, json);
        } catch (e) {
            callback(e);
        }
    })
}

Here the callback passed to fs.readFile cannot be moved out of readFileAsJson as it is creating a closure over the unique variable callback. It should also be noted that making the fs.readFile callback a named function declaration instead of an inline anonymous function will not make any difference.

An optimization that is used internally in Bluebird to a great extent is the usage of an explicit plain object to hold contextual data. An operation that consists of callback passing through multiple layers will only require one allocation of such an object. Instead of each layer creating a new closure every time a callback is passed to another layer, the explicit plain object is passed as an extra argument. For example, if there are 5 callback steps in an operation, using closures would mean allocating 5 function objects and Context objects whereas only 1 plain object would be allocated in total when using the explicit plain object optimization.

If the fs.readFile API could be modified to accept a context object, applying the optimization on the example would look like this:

var fs = require('fs-modified');

function internalReadFileCallback(error, result) {
    // The modified readFile calls the callback with the context object set to `this`,
    // which is just the original client's callback function
    if (error) {
        return this(error);
    }
    // The try-catch block is needed to handle a possible syntax error from invalid JSON
    try {
        var json = JSON.parse(result);
        this(null, json);
    } catch (e) {
        this(e);
    }
}

function readFileAsJson(fileName, callback) {
    // The modified fs.readFile would take the context object as 4th argument.
    // There is no need to create a separate plain object to contain `callback` so it
    // can just be made the context object directly.
    fs.readFile(fileName, 'utf8', internalReadFileCallback, callback);
}

As is evident, you need to control both ends of an API which makes this optimization unusable with APIs that don’t take context object parameters. However, when you can utilize it (e.g. when you control multiple internal layers) the performance gains are substantial. A little known fact: some built-in JavaScript Array APIs, such as Array.prototype.forEach, take a context object argument as the second parameter.

2. Minimize object size

It is crucial to minimize the size of the objects that are allocated often and in great quantities such as promises. The heap where objects are allocated in the most used JavaScript implementations is divided into segments and spaces. Smaller objects take longer to fill the spaces and segments than larger objects thus creating less work for the garbage collector. Smaller objects also generally have less fields for the garbage collector to visit when determining live and dead objects.

Boolean and/or restricted integer fields can be packed into a much smaller space by utilizing bitwise operators. JavaScript bitwise operators operate on 32-bit integers so you can for example fit 32 boolean fields or 8 4-bit integers or 16 booleans and 2 8-bit integers etc. into just one field. To keep the code readable, each logical field should have a getter and setter function pair that performs the proper bitwise operations on the physical field. Example of 1 boolean field packed into an integer (which can be expanded to contain more logical fields in the future) might look like this:

// Use 1 << 1 for the second bit, 1 << 2 for the third bit etc.
const READONLY = 1 << 0;

class File {
    constructor() {
        this._bitField = 0;
    }

    isReadOnly() {
        // Parentheses are required.
        return (this._bitField & READONLY) !== 0;
    }

    setReadOnly() {
        this._bitField = this._bitField | READONLY;
    }

    unsetReadOnly() {
        this._bitField = this._bitField & (~READONLY);
    }
}

The accessor methods are so short that it is very likely that they will be inlined at runtime so there is no function call overhead involved.

Two or more fields that are never used at the same time can be compressed into just one field when using a boolean to track which type of value the field holds. However, this will only result in space savings if the boolean field is implemented as a packed integer field as described in the previous trick.

In Bluebird this trick is applied when storing the fulfillment value or rejection reason of a promise. There is no explicit field for it: if the promise is fulfilled then the fulfillment value can be stored in the rejection callback field, and if the promise is rejected then the rejection reason can be stored in the fulfillment callback field. Again, all accesses should be made through accessor functions that hide the ugly details of this optimization.

If an object requires a list of items you might be able to avoid a separate array allocation by simply storing the values directly on the object’s indexed properties.

So instead of:

class EventEmitter {
    constructor() {
        this.listeners = [];
    }

    addListener(fn) {
        this.listeners.push(fn);
    }
}

You can avoid the array:

class EventEmitter {
    constructor() {
        this.length = 0;
    }

    addListener(fn) {
        var index = this.length;
        this.length++;
        this[index] = fn;
    }
}

If the .length field can be restricted to a small integer (e.g. 10-bit which would limit the event emitter to a maximum of 1024 listeners) it can be made into a part of a packed bit field along with other restricted integers or booleans.

3. Use no-op functions and overwrite them lazily to implement costly optional features

Bluebird has several optional features that cause a uniform performance loss throughout the library when used. Such features are warnings, long stack traces, cancellation, Promise.prototype.bind and promise state monitoring. These features require calls to hook functions throughout the library. For example, the promise monitoring feature needs a function to be called every time a promise is created.

Checking if the monitoring feature is enabled before making the call to the hook function would be better than just making the call regardless of if the feature is enabled. However, the cost can actually be completely eliminated for users that have not enabled the feature thanks to inline caches and function inlining. This can be done by making the initial hook method just an empty function:

class Promise {

    // ...

    constructor(executor) {
        // ...
        this._promiseCreatedHook();
    }

    // Just an empty no-op method.
    _promiseCreatedHook() {}

}

Now, if the user doesn’t enable the monitoring feature, the optimizer sees that the function call doesn’t do anything and eliminates it. So the call to the hook method inside the constructor effectively doesn’t exist.

To make the feature actually work, the act of enabling the feature must overwrite all the related no-op functions with the actual implementations:

function enableMonitoringFeature() {
    Promise.prototype._promiseCreatedHook = function() {
        // Actual implementation here
    };

    // ...
}

Overwriting the method like this will invalidate any inline caches built for objects of the Promise class so this should only be done at the startup of the application before any promise objects are created. If the overwriting happens before any usage, the inline caches that will be built from usage after the feature is enabled will not realize the no-op functions even existed.

Recommended posts