This article is a roundup of common MongoDB replica set operations, interleaved with the principles behind them and the points to watch out for.

Combined with the earlier article, "Setting up a MongoDB Replica Set", it should get you comfortable with building and managing MongoDB in fairly short order.

The operations below fall into two parts:

1. Modifying node state

This covers:

1> Stepping the Primary down to a Secondary

2> Freezing a Secondary

3> Forcing a Secondary into maintenance mode

2. Modifying the replica set configuration

1> Adding a member

2> Removing a member

3> Configuring a Secondary as a delayed member

4> Configuring a Secondary as a hidden member

5> Replacing a replica set member

6> Setting member priorities

7> Preventing a Secondary from being elected Primary

8> Configuring a non-voting Secondary

9> Disabling chainingAllowed

10> Explicitly specifying a Secondary's sync source

11> Preventing a Secondary from building indexes

First, list all the operations a replica set supports:

> rs.help()
    rs.status()                                { replSetGetStatus : 1 } checks repl set status
    rs.initiate()                              { replSetInitiate : null } initiates set with default settings
    rs.initiate(cfg)                           { replSetInitiate : cfg } initiates set with configuration cfg
    rs.conf()                                  get the current configuration object from local.system.replset
    rs.reconfig(cfg)                           updates the configuration of a running replica set with cfg (disconnects)
    rs.add(hostportstr)                        add a new member to the set with default attributes (disconnects)
    rs.add(membercfgobj)                       add a new member to the set with extra attributes (disconnects)
    rs.addArb(hostportstr)                     add a new member which is arbiterOnly:true (disconnects)
    rs.stepDown([stepdownSecs, catchUpSecs])   step down as primary (disconnects)
    rs.syncFrom(hostportstr)                   make a secondary sync from the given member
    rs.freeze(secs)                            make a node ineligible to become primary for the time specified
    rs.remove(hostportstr)                     remove a host from the replica set (disconnects)
    rs.slaveOk()                               allow queries on secondary nodes
    rs.printReplicationInfo()                  check oplog size and time range
    rs.printSlaveReplicationInfo()             check replica set members and replication lag
    db.isMaster()                              check who is primary

    reconfiguration helpers disconnect from the database so the shell will display
    an error, even if the command succeeds.

Modifying node state

Stepping the Primary down to a Secondary

myapp:PRIMARY> rs.stepDown()

This command steps the primary down to a Secondary and keeps it ineligible to become primary again for 60 seconds (the default). If no new primary has been elected within that window, the node may stand for election again.

The step-down period can also be given explicitly, in seconds, e.g.:

myapp:PRIMARY> rs.stepDown(120)

After the command completes, the original Secondary, node3:27017, is promoted to Primary.

Its log output (abridged; fields lost in the capture are shown as "..."):

I COMMAND  [conn8] Attempting to step down in response to replSetStepDown command
I -        [conn8] end connection 127.0.0.1:... (... connections now open)
I REPL     [ReplicationExecutor] Member node3:27018 is now in state SECONDARY
I REPL     [replication] Restarting oplog query due to error: InterruptedDueToReplStateChange: operation was interrupted ...
I REPL     [replication] Scheduled new oplog query Fetcher source: node3:27018 database: local query: { find: "oplog.rs", filter: { ts: { $gte: ... } }, tailable: true, oplogReplay: true, awaitData: true, ... }
I REPL     [replication] Choosing new sync source because our current sync source, node3:27018, has an OpTime which is not ahead of ours, it does not have a sync source, and it's not the primary (sync source does not know the primary)
I REPL     [replication] Canceling oplog query because we have to choose a new sync source. Current source: node3:27018 ...
W REPL     [rsBackgroundSync] Fetcher stopped querying remote oplog with error: InvalidSyncSource: sync source node3:27018 ... is no longer valid
I REPL     [rsBackgroundSync] could not find member to sync from
I REPL     [SyncSourceFeedback] SyncSourceFeedback error sending update to node3:27018: InvalidSyncSource: Sync source was cleared. Was node3:27018
I REPL     [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
I REPL     [ReplicationExecutor] conducting a dry run election to see if we could be elected
I REPL     [ReplicationExecutor] VoteRequester (dry run) received a yes vote from node3:...; response message: { term: ..., voteGranted: true, reason: "", ok: 1.0 }
I REPL     [ReplicationExecutor] dry election run succeeded, running for election
I REPL     [ReplicationExecutor] VoteRequester received a yes vote from node3:...; response message: { term: ..., voteGranted: true, reason: "", ok: 1.0 }
I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term ...
I REPL     [ReplicationExecutor] transition to PRIMARY
I ASIO     [NetworkInterfaceASIO-Replication] Connecting to node3:...
I ASIO     [NetworkInterfaceASIO-Replication] Successfully connected to node3:...
I REPL     [ReplicationExecutor] My optime is most up-to-date, skipping catch-up and completing transition to primary.
I REPL     [rsSync] transition to primary complete; database writes are now permitted
I NETWORK  [thread1] connection accepted from 192.168.244.30:35838 #10 (4 connections now open)
I NETWORK  [conn10] received client metadata from 192.168.244.30:... conn10: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4...." }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", ... } }

The original Primary, node3:27018, steps down to Secondary. Its log (abridged likewise):

I COMMAND  [conn7] Attempting to step down in response to replSetStepDown command
I REPL     [conn7] transition to SECONDARY
I NETWORK  [conn7] legacy transport layer closing all connections
I NETWORK  [conn7] Skip closing connection for connection #...
I NETWORK  [conn7] Skip closing connection for connection #...
I NETWORK  [thread1] connection accepted from 127.0.0.1:... (... connections now open)
I NETWORK  [conn8] received client metadata from 127.0.0.1:... conn8: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4...." }, os: { type: "Linux", ... } }
I -        [conn7] AssertionException handling request, closing client connection: 172 Operation attempted on a closed transport Session.
I -        [conn7] end connection 127.0.0.1:... (... connections now open)
I COMMAND  [conn5] command local.oplog.rs command: find { find: "oplog.rs", filter: { ts: { $gte: ... } }, tailable: true, oplogReplay: true, awaitData: true, ... } planSummary: COLLSCAN ... protocol:op_command 100ms
I REPL     [ReplicationExecutor] Member node3:27017 is now in state PRIMARY
I REPL     [rsBackgroundSync] sync source candidate: node3:27017
I ASIO     [NetworkInterfaceASIO-RS] Connecting to node3:27017
I ASIO     [NetworkInterfaceASIO-RS] Successfully connected to node3:27017

Freezing a Secondary

If you need to do maintenance on the Primary but don't want any other node elected Primary during that window, you can run freeze on each Secondary to force it to stay a Secondary for the given number of seconds, e.g.:

myapp:SECONDARY> rs.freeze(120)

Note: the command can only be run on a Secondary:

myapp:PRIMARY> rs.freeze(120)
{
"ok" : 0,
"errmsg" : "cannot freeze node when primary or running for election. state: Primary",
"code" : 95,
"codeName" : "NotSecondary"
}

To unfreeze a Secondary, pass 0:

myapp:SECONDARY> rs.freeze(0)
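
If every Secondary needs to be frozen before the maintenance window, a small shell sketch like the one below can save some typing. It is only a sketch: it assumes the shell can open a direct, unauthenticated connection to each member, and it uses replSetFreeze, the server command behind rs.freeze().

// Freeze every secondary for 300 seconds before maintaining the primary.
rs.status().members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
        var conn = new Mongo(m.name);  // direct connection to that member
        printjson(conn.getDB("admin").runCommand({ replSetFreeze: 300 }));
    }
});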

Forcing a Secondary into maintenance mode

Once a Secondary enters maintenance mode, its state changes to RECOVERING. Clients do not send read requests to a node in this state, and it cannot serve as a sync source.

Maintenance mode can be triggered in two ways:

1. Automatically

For example, when a compact operation runs on the Secondary.

2. Manually

myapp:SECONDARY> db.adminCommand({"replSetMaintenance":true})
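
To bring the node out of maintenance mode, set the flag back to false:

myapp:SECONDARY> db.adminCommand({"replSetMaintenance": false})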

Modifying the replica set configuration

Adding a member

myapp:PRIMARY> rs.add("node3:27017")
myapp:PRIMARY> rs.add({_id: 3, host: "node3:27017", priority: 0, hidden: true})

A member can also be added with a full configuration document (note that a hidden member must have priority 0):

> cfg = {
"_id" : 3,
"host" : "node3:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : true,
"priority" : 0,
"tags" : { },
"slaveDelay" : NumberLong(0),
"votes" : 1
}
> rs.add(cfg)
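
Member _id values must be unique within the set. A minimal sketch for picking the next free _id before adding (assuming the mongo shell, and that rs.conf() is readable):

var conf = rs.conf();
// One past the highest _id currently in use, so the new member cannot collide.
var nextId = Math.max.apply(null, conf.members.map(function (m) { return m._id; })) + 1;
rs.add({_id: nextId, host: "node3:27017", priority: 0, hidden: true});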

Removing a member

Method one:

myapp:PRIMARY> rs.remove("node3:27017")

Method two:

myapp:PRIMARY> cfg = rs.conf()
myapp:PRIMARY> cfg.members.splice(2, 1)
myapp:PRIMARY> rs.reconfig(cfg)

(The arguments to splice are the member's position in the members array, not its _id, and the number of members to remove; the example above removes the third member.)

Note: running rs.reconfig does not necessarily force a re-election, and adding the force parameter does not change that.

The rs.reconfig() shell method can trigger the current primary to step down in some situations. 
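
For completeness, a forced reconfiguration, normally reserved for when a majority of members are unreachable, looks like this (use with care, since it can cause a rollback of committed writes):

rs.reconfig(cfg, {force: true})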

Modifying member configuration

Configuring a Secondary as a delayed member

cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
cfg.members[0].slaveDelay = 3600
rs.reconfig(cfg)
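
Here slaveDelay is in seconds (one hour above). A delayed member must keep priority 0 so that it can never be elected, and hiding it keeps clients from reading its intentionally stale data.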

Configuring a Secondary as a hidden member

cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
rs.reconfig(cfg)

Replacing a replica set member

cfg = rs.conf()
cfg.members[0].host = "mongo2.example.net"
rs.reconfig(cfg)

Setting member priorities

cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 2
cfg.members[2].priority = 2
rs.reconfig(cfg)

Valid priorities range from 0 to 1000, fractional values are allowed, and the default is 1.

Starting with MongoDB 3.2:

Non-voting members must have priority of 0.
Members with priority greater than 0 cannot have 0 votes.

Note: if you give a Secondary a higher priority than the current Primary's, the Primary will step down.
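
Before reshuffling priorities, it helps to review what each member currently has. A read-only sketch in the mongo shell:

rs.conf().members.forEach(function (m) {
    print(m.host, "priority:", m.priority, "votes:", m.votes);
});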

Preventing a Secondary from being elected Primary

Just set its priority to 0:

cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)

Configuring a non-voting Secondary

MongoDB limits a replica set to at most 50 members, of which at most 7 can vote.

The limits exist mainly because of the network traffic heartbeats generate (every member sends heartbeat requests to every other member) and the time elections would otherwise take.

Starting with MongoDB 3.2, a member whose priority is greater than 0 cannot have votes set to 0.

So for a non-voting Secondary, votes and priority must both be set to 0:

cfg = rs.conf()
cfg.members[3].votes = 0
cfg.members[3].priority = 0
cfg.members[4].votes = 0
cfg.members[4].priority = 0
rs.reconfig(cfg)

Disabling chainingAllowed

By default, chained replication is allowed.

That is, when a new member is added to the set, it may well replicate from one of the Secondaries rather than from the Primary.

MongoDB chooses a sync source based on ping time: each heartbeat a member sends tells it how long the round trip to that peer takes. MongoDB keeps a running average of these times between members and, when choosing a sync source, picks a member that is both close to itself and ahead of it in the data.

How do you tell which member a node is replicating from?

myapp:PRIMARY> rs.status().members[1].syncingTo
node3:27018
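
To see the whole replication topology at a glance, you can walk rs.status() in the shell (read-only; syncingTo is the field name 3.4 reports, and it is empty for the Primary):

rs.status().members.forEach(function (m) {
    print(m.name, m.stateStr, "syncingTo:", m.syncingTo);
});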

Chained replication has an obvious drawback, of course: the longer the replication chain, the longer it takes for a write to propagate to every Secondary.

It can be disabled as follows:

cfg = rs.conf()
cfg.settings.chainingAllowed = false
rs.reconfig(cfg)

Once chainingAllowed is set to false, every Secondary replicates directly from the Primary.

Explicitly specifying a Secondary's sync source

rs.syncFrom("node3:27019")
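
Note that the override is temporary: the member falls back to automatic sync source selection when mongod restarts, when the connection to the chosen member closes, or when that member falls more than 30 seconds behind another member of the set.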

Preventing a Secondary from building indexes

Sometimes a Secondary does not need the same indexes as the Primary, for example a node used only for backups or for offline batch jobs. In that case you can stop the Secondary from building indexes.

In MongoDB 3.4 the setting cannot be changed on an existing member; it can only be set explicitly when the member is added:

myapp:PRIMARY> cfg=rs.conf()
myapp:PRIMARY> cfg.members[3].buildIndexes=false
false
myapp:PRIMARY> rs.reconfig(cfg)
{
"ok" : 0,
"errmsg" : "priority must be 0 when buildIndexes=false",
"code" : 103,
"codeName" : "NewReplicaSetConfigurationIncompatible"
}
myapp:PRIMARY> cfg.members[3].priority=0
0
myapp:PRIMARY> rs.reconfig(cfg)
{
"ok" : 0,
"errmsg" : "New and old configurations differ in the setting of the buildIndexes field for member node3:27017; to make this change, remove then re-add the member",
"code" : 103,
"codeName" : "NewReplicaSetConfigurationIncompatible"
}
myapp:PRIMARY> rs.remove("node3:27017")
{ "ok" : 1 }
myapp:PRIMARY> rs.add({_id: 3, host: "node3:27017", priority: 0, buildIndexes: false})
{ "ok" : 1 }

As the test shows, to set buildIndexes to false for a member, you must also set its priority to 0, and the setting can only be applied when the member is first added.

References

1. MongoDB: The Definitive Guide

2. MongoDB official documentation
