Skip to main content

Sending data between Worker Threads

The Worker Threads give us quite exciting features when compared to, for example, child processes. In this article, we create our instances of the MessageChannel to communicate with the Worker Threads. Aside from that, we also share the data between them: it’s something that is not possible with child processes.

Communicating using the MessageChannel

There are a few ways to communicate with Node.js Worker Threads. In the previous part of this series, aside from passing the data through workerData, we’ve used the parentPort to send the data from the worker thread to the parent.

worker.ts
import { parentPort, workerData } from 'worker_threads';

function factorial(n: number): number {
if(n === 1 || n === 0){
return 1;
}
return factorial(n - 1) * n;
}

parentPort.postMessage(
factorial(workerData.value)
);
main.ts
import { Worker } from 'worker_threads';

const worker = new Worker('./worker.js', {
workerData: {
value: 15,
path: './worker.ts'
}
});

worker.on('message', (result) => {
console.log(result);
});

In the example above we pass two different paths, both to worker.js and worker.ts. This is connected to the fact that we can’t as of now create worker threads purely from TypeScript files. For more information check out the previous part of this series.

While this is a perfectly fine solution, there is more to it. We can create our ports if we find ourselves in a situation requiring a more complex solution.

Creating a new MessageChannel

A surprising fact is that the MessageChannel does not have any methods on its own. The only two properties it has are port1 and port2, both instances of MessagePort.

import { MessageChannel } from 'worker_threads';

const { port1, port2 } = new MessageChannel();

Each one of the ports represents one end of the communication channel. Since the parentPort is also an instance of the MessagePort, we can guess that our new ports also have the postMessage function.

import { MessageChannel } from 'worker_threads';

const { port1, port2 } = new MessageChannel();

port1.postMessage('Hello world!');

port2.on('message', (message) => {
console.log(message);
});

The MessageChannel is capable of working without involving any Worker Threads, as in the example above. To fully understand their purpose, let’s use them with a worker.

Sending MessagePort through a transferList

To set up such communication, we need to send one of the ports to the other thread, while listening to messages on the other end.

main.ts
import { Worker, MessageChannel } from 'worker_threads';
const { port1, port2 } = new MessageChannel();

const worker = new Worker('./worker.js', {
workerData: {
path: './worker.ts'
}
});

port1.on('message', (result) => {
console.log(result);
});

worker.postMessage({ port: port2, value: 15 });
worker.ts
import { parentPort } from 'worker_threads';
import Data from './Data';

function factorial(n: number): number {
if(n === 1 || n === 0){
return 1;
}
return factorial(n - 1) * n;
}

parentPort.on('message', (data: Data) => {
const { port } = data;
port.postMessage(factorial(data.value));
});
Data.ts
import { MessagePort } from 'worker_threads';

export default interface Data {
port: MessagePort;
value: number;
}

Well, almost there! Running the above code results in an error:

MessagePort was found in message but not listed in transferList

Different ways of sending the data

The transferList that the error mentions is an additional argument of the postMessage function. The above is connected to how the postMessage function works. By default, it creates a clone of the data that we send, but we can change that behavior with the transferList. This is enforced if we use MessagePort. Let’s improve the example from above:

Sending MessagePort through transferList

main.ts
import { Worker, MessageChannel } from 'worker_threads';
const { port1, port2 } = new MessageChannel();

const worker = new Worker('./worker.js', {
workerData: {
path: './worker.ts'
}
});

port1.on('message', (result) => {
console.log(result);
});

worker.postMessage({ port: port2, value: 15 }, [port2]);

Now we explicitly state that we don’t want a clone of the port2. Thanks to that, we pass an original object to the worker.

worker.postMessage({ port: port2, value: 15 }, [port2]);

The MessageChannel might prove to be useful with more complex algorithms where we create multiple threads and would like to message the main thread from one of our nested Worker Threads.

Sending ArrayBuffers

In the previous examples, we send elementary data like a number. Nothing restraints us from sending more complex data structures.

const numbers = [ 11, 12, 13, 14, 15 ];

worker.postMessage({ port: port2, value: numbers }, [port2]);

It works without issues, but in the example above the numbers, array is cloned. The more complex the data structure, the more computing power it takes to clone it.

We can prevent that from happening. To do that, we need to use the ArrayBuffer that differs a bit from the Buffer that we cover in the third part of this series. One of the easiest ways to create it is to use an Uint8Array that represents an array of 8-bit unsigned integers.

const array = new Uint8Array([ 11, 12, 13, 14, 15 ]);

worker.postMessage({ port: port2, value: array }, [port2, array.buffer]);
worker.ts
parentPort.on('message', (data: Data) => {
const { port, value } = data;
value.forEach((number => {
port.postMessage(factorial(number));
}))
});

As you can see, the ArrayBuffer buffer that we need is in the array.buffer property. Since we operate on bytes here, we need to be aware of the restraints. Let’s look at how the number 256 is represented in binary: binary.png

2⁸ = 256

When we look at the example above we can figure out that the number 256 does not fit into eight bit.Uint8Array ignores excess data and due to that, 256 equals 0. You can use Uint16Array instead, but keep in mind that every element of that array uses more memory.

A crucial thing is that when we transfer the buffer and append it to the transferList, it is no longer available to the sender.

const array = new Uint8Array([ 11, 12, 13, 14, 15 ]);

worker.postMessage({ port: port2, value: array }, [port2, array.buffer]);

console.log(array.buffer);

//> TypeError: Cannot perform Construct on a neutered ArrayBuffer

The above is a mechanism implemented to guarantee no race conditions between 2 different threads trying to access the same chunk of memory. It is a fundamental issue with multithreading that is tackled in multiple ways in the multithreading world – for example with locks.

Sharing the data using SharedArrayBuffer

The above restraint, while preventing us from destroying our data, has a workaround. When using Worker Threads, we can designate a piece of memory accessible for more than one thread. To do that, we need to create Uint8Array using a special type of a buffer called SharedArrayBuffer.

const sharedArrayBuffer = new SharedArrayBuffer(2);
const array = new Uint8Array(sharedArrayBuffer);
array[0] = 11;
array[1] = 12;

worker.postMessage({ port: port2, value: array }, [port2]);

This way the array is neither cloned, nor it is unavailable to the sender. We need to be especially careful though we stripped ourselves from the built-in mechanism that keeps the integrity of the data. You risk changing a value in one thread and expecting it to be something else in another thread.

Sharing arrays can come in handy, but in TypeScript, we use lots of objects. Currently, ECMA Script does not contain shared objects, but we can imitate similar behavior using DataView.

Summary

In this article, we dived deeper into Node.js Worker Threads, focusing on how can we communicate between Worker Threads and send data. We used MessageChannel to create a two-way communication channel between Node.js Worker Threads. Aside from that, we implemented different approaches to sending data between threads. It included cloning, sharing the data between multiple threads and restricting the access to just one thread through the transferList.