[Node.js] Availability and Zero-downtime Restarts

A Node.js server can suffer downtime, whether because of a deployment or because of a crash in the code. We want to minimize that downtime as much as possible.

1. When a cluster worker crashes, we want the master process to fork a new worker:

const http = require('http');

const cluster = require('cluster');

const os = require('os');

if (cluster.isMaster) {

    const cpus = os.cpus().length;

    console.log(`Forking for ${cpus} CPUs`);

    for (let i = 0; i < cpus; i++) {

        cluster.fork();

    }

    cluster.on('exit', (worker, code, signal) => {

        if (code !== 0 && !worker.exitedAfterDisconnect) {

            console.log(`Worker ${worker.id} crashed. Starting a new worker`);

            cluster.fork();

        }

    })

} else {

    require('./server');

}

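It is important to check 'worker.exitedAfterDisconnect' to tell whether the worker crashed or whether we deliberately disconnected it. Without this check, the master would also fork a replacement for every worker we shut down on purpose.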
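To make the condition in the 'exit' handler easier to reason about, here is a small sketch that pulls it out into a standalone function. The name `shouldRespawn` is a hypothetical helper, not part of the cluster API; it only mirrors the check used above.

```javascript
// Hypothetical helper mirroring the condition in the 'exit' handler above.
// code: the worker's exit code (null when the worker was killed by a signal)
// exitedAfterDisconnect: true when the worker exited after we disconnected it
function shouldRespawn(code, exitedAfterDisconnect) {
    // A deliberate shutdown must never trigger a replacement fork.
    if (exitedAfterDisconnect) return false;
    // Any non-zero (or missing) exit code is treated as a crash.
    return code !== 0;
}
```

In the master this could be used as `if (shouldRespawn(code, worker.exitedAfterDisconnect)) cluster.fork();`.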

2. When upgrading, we want to restart the workers one by one, so that there is no downtime:

    // Trigger with: kill -SIGUSR2 <MASTER_PID>

    // On upgrade, we want to restart the workers one by one

    process.on('SIGUSR2', () => {

        // Snapshot the current workers, so newly forked ones are not restarted again

        const workers = Object.values(cluster.workers);

        const restartWorker = (workerIndex) => {

            const worker = workers[workerIndex];

            if (!worker) return;

            // Once this worker has exited, fork its replacement, then continue

            // with the next worker

            worker.on('exit', () => {

                // If it exited because of a crash, we don't continue

                if (!worker.exitedAfterDisconnect) return;

                console.log(`Exited process ${worker.process.pid}`);

                cluster.fork().on('listening', () => {

                    restartWorker(workerIndex + 1);

                });

            });

            // Ask the worker to stop accepting connections and exit gracefully

            worker.disconnect();

        };

        // Start the recursive chain with the first worker

        restartWorker(0);

    });
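The recursion in the handler can be hard to follow when it is mixed with cluster events. As a rough sketch, the same "one at a time" pattern can be written without any processes at all. Both `restartSequentially` and the `restartOne(item, ready)` callback are hypothetical names introduced here for illustration.

```javascript
// Process items strictly one after another: the next item is only started
// once the previous one has called its `ready` callback.
function restartSequentially(items, restartOne, done) {
    const step = (index) => {
        if (index >= items.length) {
            // Every item has been restarted.
            if (done) done();
            return;
        }
        // Hand over a callback that moves on to the next item.
        restartOne(items[index], () => step(index + 1));
    };
    step(0);
}
```

With the cluster module, `restartOne` would register the 'exit' listener, call `worker.disconnect()`, and invoke `ready` when the replacement worker emits 'listening'. Because only one worker is down at any moment, the remaining workers keep serving requests.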

In real production we don't need to write the cluster logic ourselves; we can use the PM2 package. But it is important to understand what is happening under the hood.
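For comparison, a minimal PM2 setup might look like the following ecosystem file. This is a sketch under assumptions: the app name `server` and the entry point `./server.js` are made up for illustration, and with PM2 the script would be the plain HTTP server rather than the hand-written master file.

```javascript
// ecosystem.config.js (sketch; name and script path are assumptions)
const config = {
    apps: [
        {
            name: 'server',        // hypothetical app name
            script: './server.js', // assumed entry point: the plain HTTP server
            instances: 'max',      // one process per available CPU
            exec_mode: 'cluster',  // let PM2 run the app in cluster mode
        },
    ],
};

module.exports = config;
```

The app would then be started with `pm2 start ecosystem.config.js`, and `pm2 reload server` restarts its processes one by one, which plays the same role as the SIGUSR2 handler above.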

---

const cluster = require('cluster');

const http = require('http');

const os = require('os');

// When the file is run for the first time,

// the master process is started,

// and it then forks the workers

if (cluster.isMaster) {

    const cpus = os.cpus().length;

    console.log(`Forking for ${cpus} CPUs`);

    for (let i = 0; i < cpus; i++) {

        cluster.fork();

    }

    // In case of a crash, we want to start a new worker

    cluster.on('exit', (worker, code, signal) => {

        if (code !== 0 && !worker.exitedAfterDisconnect) {

            console.log(`Worker ${worker.id} crashed. Starting a new worker`);

            cluster.fork();

        }

    })

    // Trigger with: kill -SIGUSR2 <MASTER_PID>

    // On upgrade, we want to restart the workers one by one

    process.on('SIGUSR2', () => {

        // Snapshot the current workers, so newly forked ones are not restarted again

        const workers = Object.values(cluster.workers);

        const restartWorker = (workerIndex) => {

            const worker = workers[workerIndex];

            if (!worker) return;

            // Once this worker has exited, fork its replacement, then continue

            // with the next worker

            worker.on('exit', () => {

                // If it exited because of a crash, we don't continue

                if (!worker.exitedAfterDisconnect) return;

                console.log(`Exited process ${worker.process.pid}`);

                cluster.fork().on('listening', () => {

                    restartWorker(workerIndex + 1);

                });

            });

            // Ask the worker to stop accepting connections and exit gracefully

            worker.disconnect();

        };

        // Start the recursive chain with the first worker

        restartWorker(0);

    });

} else {

    require('./server');

}
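
The listing loads './server', which is not shown in this post. A minimal sketch of what such a module might look like follows. The port 8080, the response text, and the helper names `responseFor` and `createServer` are all assumptions made for illustration.

```javascript
// server.js (hypothetical): the module each worker loads via require('./server')
const http = require('http');
const cluster = require('cluster');

const PORT = 8080; // assumed port, not specified in the post

// Build the response body; kept separate so it can be checked on its own.
function responseFor(pid) {
    return `Handled by process ${pid}\n`;
}

// Create the HTTP server without starting it.
function createServer() {
    return http.createServer((req, res) => {
        res.end(responseFor(process.pid));
    });
}

// Only listen when running as a cluster worker, so requiring this file
// elsewhere does not open a port.
if (cluster.isWorker) {
    createServer().listen(PORT, () => {
        console.log(`Started process ${process.pid}`);
    });
}
```

All workers call `listen` on the same port; the cluster module distributes incoming connections among them, which is why one worker can be restarted while the others keep answering requests.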
