A replica set is MongoDB's built-in high-availability solution. Unlike the older master-slave replication, a replica set automatically detects that the primary node has gone down and promotes one of the secondaries to primary.

The whole process is transparent to the application, and it also greatly reduces operational overhead.


Roles in a MongoDB replica set

1. Primary

By default, all reads and writes go to the primary.

2. Secondary

A secondary holds a complete copy of the primary's data, which it maintains by replaying every operation from the primary's oplog.
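The oplog itself is an ordinary capped collection in the local database and can be inspected from the shell; a minimal sketch (collection name and fields as in MongoDB 3.x):

```javascript
use local
// show the most recent oplog entry: op is the operation type, ns the namespace
db.oplog.rs.find().sort({ $natural: -1 }).limit(1).pretty()
```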

By default, a secondary accepts neither writes nor reads.

Depending on the requirements, a secondary can be configured in the following forms:

1> Priority 0 Replica Set Members

A member with priority 0 will never be elected primary.

MongoDB replica sets allow different members to be given different priorities.

The priority ranges from 0 to 1000, may be a floating-point value, and defaults to 1.

The member with the highest priority is preferred in elections for primary.

For example, suppose a member node3:27020 with priority 2 is added to a replica set in which every other member has priority 1. As long as node3:27020 holds the newest data, the current primary automatically steps down and node3:27020 is elected the new primary. If node3:27020's data is not current, the existing primary keeps its role until node3:27020 has caught up.
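Priorities are changed by editing the configuration document and reconfiguring the set; a sketch, assuming node3:27020 is the third entry in the members array:

```javascript
cfg = rs.conf()
cfg.members[2].priority = 2   // assumption: node3:27020 is members[2]
rs.reconfig(cfg)              // may trigger an election
```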

2> Hidden Replica Set Members (hidden members)

A hidden member also has priority 0, and in addition it is invisible to clients.

Hidden members do appear in the output of rs.status() and rs.conf(), but they are not reported by db.isMaster(). Since a client connecting to a replica set calls db.isMaster() to discover the available members, hidden members never receive client read requests.

Hidden members are commonly used for dedicated tasks such as reporting and backups.
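Hiding a member is likewise done through rs.reconfig(); a minimal sketch, assuming the member to hide is members[1]:

```javascript
cfg = rs.conf()
cfg.members[1].priority = 0   // a hidden member must have priority 0
cfg.members[1].hidden = true
rs.reconfig(cfg)
```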

3> Delayed Replica Set Members (delayed members)

A delayed member lags behind the primary by a specified amount of time (set through the slaveDelay option).

A delayed member must also be a hidden member.
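A delayed member combines the two settings above with slaveDelay (in seconds); a sketch, again assuming members[1] is the target:

```javascript
cfg = rs.conf()
cfg.members[1].priority = 0
cfg.members[1].hidden = true
cfg.members[1].slaveDelay = 3600   // stay one hour behind the primary
rs.reconfig(cfg)
```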

3. Arbiter

An arbiter participates only in voting, always with a vote weight of 1; it replicates no data and can never be promoted to primary.

Arbiters are typically used in replica sets with an even number of data-bearing members.

Recommendation: deploy the arbiter on an application server; never place it on the same server as the primary or a secondary.

Note: a replica set can have at most 50 members, of which at most 7 may vote.

Building a MongoDB replica set

Create the data directories

# mkdir -p /data/27017

# mkdir -p /data/27018

# mkdir -p /data/27019

To make it easier to follow the logs at runtime, create a separate log file for each instance:

# mkdir -p /var/log/mongodb/

Start the mongod instances

# mongod --replSet myapp --dbpath /data/27017 --port 27017 --logpath /var/log/mongodb/27017.log --fork

# mongod --replSet myapp --dbpath /data/27018 --port 27018 --logpath /var/log/mongodb/27018.log --fork

# mongod --replSet myapp --dbpath /data/27019 --port 27019 --logpath /var/log/mongodb/27019.log --fork

Taking the instance on port 27017 as an example, its log output looks like this:

--02T14::22.745+ I CONTROL  [initandlisten] MongoDB starting : pid= port= dbpath=/data/ -bit host=node3
--02T14::22.745+ I CONTROL [initandlisten] db version v3.4.2
--02T14::22.745+ I CONTROL [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
--02T14::22.745+ I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips Feb
--02T14::22.745+ I CONTROL [initandlisten] allocator: tcmalloc
--02T14::22.745+ I CONTROL [initandlisten] modules: none
--02T14::22.745+ I CONTROL [initandlisten] build environment:
--02T14::22.745+ I CONTROL [initandlisten] distmod: rhel62
--02T14::22.745+ I CONTROL [initandlisten] distarch: x86_64
--02T14::22.745+ I CONTROL [initandlisten] target_arch: x86_64
--02T14::22.745+ I CONTROL [initandlisten] options: { net: { port: }, processManagement: { fork: true }, replication: { replSet: "myapp" }, storage: { dbPath: "/data/27017" }, systemLog: { destination: "file", path: "/var/log/mongodb/27017.log" } }
--02T14::22.768+ I - [initandlisten]
--02T14::22.768+ I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
--02T14::22.768+ I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
--02T14::22.769+ I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=256M,session_max=,eviction=(threads_max=),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=),checkpoint=(wait=,log_size=2GB),statistics_log=(wait=),
--02T14::24.450+ I CONTROL [initandlisten]
--02T14::24.482+ I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
--02T14::24.482+ I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
--02T14::24.482+ I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
--02T14::24.482+ I CONTROL [initandlisten]
--02T14::24.516+ I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/27017/diagnostic.data'
--02T14::24.517+ I REPL [initandlisten] Did not find local voted for document at startup.
--02T14::24.517+ I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument: Did not find replica set configuration document in local.system.replset
--02T14::24.519+ I NETWORK [thread1] waiting for connections on port

Connect to any member of the replica set with mongo. Here, connect to the instance on port 27017:

# mongo

Initialize the replica set

> rs.initiate()
{
"info2" : "no configuration specified. Using a default configuration for the set",
"me" : "node3:27017",
"ok" :
}

The current configuration of the replica set can be viewed with rs.conf():

myapp:PRIMARY> rs.conf()
{
"_id" : "myapp",
"version" : ,
"protocolVersion" : NumberLong(),
"members" : [
{
"_id" : ,
"host" : "node3:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : ,
"tags" : { },
"slaveDelay" : NumberLong(),
"votes" :
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : ,
"heartbeatTimeoutSecs" : ,
"electionTimeoutMillis" : ,
"catchUpTimeoutMillis" : ,
"getLastErrorModes" : { },
"getLastErrorDefaults" : {
"w" : ,
"wtimeout" :
},
"replicaSetId" : ObjectId("59082229517dd35bb9fd0d2a")
}
}

The options under settings are explained as follows:

chainingAllowed: whether chained (secondary-from-secondary) replication is allowed.

heartbeatIntervalMillis: the heartbeat interval, 2s by default.

heartbeatTimeoutSecs: the heartbeat timeout, 10s by default. If no heartbeat is received from a node within 10s, that node is judged unreachable (HostUnreachable); this applies to the primary and secondaries alike.
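These settings can be tuned the same way as member options, by rewriting the configuration document; a sketch (the value 15 is only an illustration):

```javascript
cfg = rs.conf()
cfg.settings.heartbeatTimeoutSecs = 15   // illustrative value
rs.reconfig(cfg)
```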

The log output is as follows:

# vim /var/log/mongodb/27017.log

--02T14::47.361+ I NETWORK  [thread1] connection accepted from 127.0.0.1: # ( connection now open)
--02T14::47.361+ I NETWORK [conn1] received client metadata from 127.0.0.1: conn1: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T14::36.737+ I COMMAND [conn1] initiate : no configuration specified. Using a default configuration for the set
--02T14::36.737+ I COMMAND [conn1] created this configuration for initiation : { _id: "myapp", version: , members: [ { _id: , host: "node3:27017" } ] }
--02T14::36.900+ I REPL [conn1] replSetInitiate admin command received from client
--02T14::37.391+ I REPL [conn1] replSetInitiate config object with members parses ok
--02T14::37.410+ I REPL [conn1] ******
--02T14::37.410+ I REPL [conn1] creating replication oplog of size: 990MB...
--02T14::37.439+ I STORAGE [conn1] Starting WiredTigerRecordStoreThread local.oplog.rs
--02T14::37.440+ I STORAGE [conn1] The size storer reports that the oplog contains records totaling to bytes
--02T14::37.440+ I STORAGE [conn1] Scanning the oplog to determine where to place markers for truncation
--02T14::37.472+ I REPL [conn1] ******
--02T14::37.568+ I INDEX [conn1] build index on: admin.system.version properties: { v: , key: { version: }, name: "incompatible_with_version_32", ns: "admin.system.version" }
--02T14::37.568+ I INDEX [conn1] building index using bulk method; build may temporarily use up to megabytes of RAM
--02T14::37.581+ I INDEX [conn1] build index done. scanned total records. secs
--02T14::37.591+ I COMMAND [conn1] setting featureCompatibilityVersion to 3.4
--02T14::37.601+ I REPL [conn1] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T14::37.601+ I REPL [conn1] This node is node3: in the config
--02T14::37.601+ I REPL [conn1] transition to STARTUP2
--02T14::37.601+ I REPL [conn1] Starting replication storage threads
--02T14::37.603+ I REPL [conn1] Starting replication fetcher thread
--02T14::37.617+ I REPL [conn1] Starting replication applier thread
--02T14::37.617+ I REPL [conn1] Starting replication reporter thread
--02T14::37.617+ I REPL [rsSync] transition to RECOVERING
--02T14::37.628+ I REPL [rsSync] transition to SECONDARY
--02T14::37.635+ I COMMAND [conn1] command local.replset.minvalid appName: "MongoDB Shell" command: replSetInitiate { v: , key: { version: }, ns: "admin.system.version", name: "incompatible_with_version_32" } numYields: reslen: locks:{ Global: { acquireCount: { r: , w: , W: }, acquireWaitCount: { W: }, timeAcquiringMicros: { W: } }, Database: { acquireCount: { r: , w: , W: } }, Collection: { acquireCount: { r: , w: } }, Metadata: { acquireCount: { w: } }, oplog: { acquireCount: { w: } } } protocol:op_command 941ms
--02T14::37.646+ I REPL [rsSync] conducting a dry run election to see if we could be elected
--02T14::37.646+ I REPL [ReplicationExecutor] dry election run succeeded, running for election
--02T14::37.675+ I REPL [ReplicationExecutor] election succeeded, assuming primary role in term
--02T14::37.675+ I REPL [ReplicationExecutor] transition to PRIMARY
--02T14::37.675+ I REPL [ReplicationExecutor] Could not access any nodes within timeout when checking for additional ops to apply before finishing transition to primary. Will move forward with becoming primary anyway.
--02T14::38.687+ I REPL [rsSync] transition to primary complete; database writes are now permitted

Adding a node

myapp:PRIMARY> rs.add("node3:27018")
{ "ok" : }

The log of the instance on port 27017 reads:

--02T15::44.737+ I COMMAND  [conn1] command local.system.replset appName: "MongoDB Shell" command: count { count: "system.replset", query: {}, fields: {} } planSummary: COUNT keysExamined: docsExamined: numYields: reslen: locks:{ Global: { acquireCount: { r:  } }, Database: { acquireCount: { r:  } }, Collection: { acquireCount: { r:  } } } protocol:op_command 135ms
--02T15::44.765+ I REPL [conn1] replSetReconfig admin command received from client
--02T15::44.808+ I REPL [conn1] replSetReconfig config object with members parses ok
--02T15::44.928+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T15::44.979+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T15::44.994+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::45.007+ I NETWORK [conn3] received client metadata from 192.168.244.30: conn3: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T15::45.009+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::45.010+ I - [conn4] end connection 192.168.244.30: ( connections now open)
--02T15::45.105+ I REPL [ReplicationExecutor] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T15::45.105+ I REPL [ReplicationExecutor] This node is node3: in the config
--02T15::45.155+ I REPL [ReplicationExecutor] Member node3: is now in state STARTUP
--02T15::45.155+ I COMMAND [conn1] command local.system.replset appName: "MongoDB Shell" command: replSetReconfig { replSetReconfig: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: 1.0, host: "node3:27018" } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } } } numYields: reslen: locks:{ Global: { acquireCount: { r: , w: , W: } }, Database: { acquireCount: { w: , W: } }, Metadata: { acquireCount: { w: } }, oplog: { acquireCount: { w: } } } protocol:op_command 403ms
--02T15::47.010+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::47.011+ I - [conn5] end connection 192.168.244.30: ( connections now open)
--02T15::47.940+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::47.941+ I NETWORK [conn6] received client metadata from 192.168.244.30: conn6: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T15::48.010+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::48.011+ I NETWORK [conn7] received client metadata from 192.168.244.30: conn7: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T15::49.159+ I REPL [ReplicationExecutor] Member node3: is now in state SECONDARY
--02T15::49.160+ I - [conn6] end connection 192.168.244.30: ( connections now open)
--02T15::03.401+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T15::03.403+ I NETWORK [conn8] received client metadata from 192.168.244.30: conn8: { driver: { name: "NetworkInterfaceASIO-RS", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }

The log of the instance on port 27018 reads:

--02T15::44.796+ I NETWORK  [thread1] connection accepted from 192.168.244.30: # ( connection now open)
--02T15::44.922+ I - [conn2] end connection 192.168.244.30: ( connection now open)
--02T15::44.965+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connection now open)
--02T15::44.978+ I NETWORK [conn3] received client metadata from 192.168.244.30: conn3: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T15::44.991+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T15::45.008+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T15::47.101+ I REPL [replExecDBWorker-] Starting replication storage threads
--02T15::47.174+ I REPL [replication-] Starting initial sync (attempt of )
--02T15::47.174+ I REPL [ReplicationExecutor] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T15::47.174+ I REPL [ReplicationExecutor] This node is node3: in the config
--02T15::47.174+ I REPL [ReplicationExecutor] transition to STARTUP2
--02T15::47.175+ I REPL [ReplicationExecutor] Member node3: is now in state PRIMARY
--02T15::47.217+ I REPL [replication-] sync source candidate: node3:
--02T15::47.217+ I STORAGE [replication-] dropAllDatabasesExceptLocal
--02T15::47.217+ I REPL [replication-] ******
--02T15::47.217+ I REPL [replication-] creating replication oplog of size: 990MB...
--02T15::47.232+ I STORAGE [replication-] Starting WiredTigerRecordStoreThread local.oplog.rs
--02T15::47.232+ I STORAGE [replication-] The size storer reports that the oplog contains records totaling to bytes
--02T15::47.232+ I STORAGE [replication-] Scanning the oplog to determine where to place markers for truncation
--02T15::47.938+ I REPL [replication-] ******
--02T15::47.939+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T15::47.941+ I ASIO [NetworkInterfaceASIO-RS-] Successfully connected to node3:
--02T15::48.010+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T15::48.011+ I ASIO [NetworkInterfaceASIO-RS-] Successfully connected to node3:
--02T15::48.046+ I REPL [replication-] CollectionCloner::start called, on ns:admin.system.version
--02T15::48.150+ I INDEX [InitialSyncInserters-admin.system.version0] build index on: admin.system.version properties: { v: , key: { version: }, name: "incompatible_with_version_32", ns: "admin.system.version" }
--02T15::48.150+ I INDEX [InitialSyncInserters-admin.system.version0] building index using bulk method; build may temporarily use up to megabytes of RAM
--02T15::48.154+ I INDEX [InitialSyncInserters-admin.system.version0] build index on: admin.system.version properties: { v: , key: { _id: }, name: "_id_", ns: "admin.system.version" }
--02T15::48.155+ I INDEX [InitialSyncInserters-admin.system.version0] building index using bulk method; build may temporarily use up to megabytes of RAM
--02T15::48.177+ I COMMAND [InitialSyncInserters-admin.system.version0] setting featureCompatibilityVersion to 3.4
--02T15::48.221+ I REPL [replication-] CollectionCloner::start called, on ns:test.blog
--02T15::48.264+ I INDEX [InitialSyncInserters-test.blog0] build index on: test.blog properties: { v: , key: { _id: }, name: "_id_", ns: "test.blog" }
--02T15::48.264+ I INDEX [InitialSyncInserters-test.blog0] building index using bulk method; build may temporarily use up to megabytes of RAM
--02T15::48.271+ I REPL [replication-] No need to apply operations. (currently at { : Timestamp | })
--02T15::48.271+ I REPL [replication-] Finished fetching oplog during initial sync: CallbackCanceled: Callback canceled. Last fetched optime and hash: { ts: Timestamp |, t: }[]
--02T15::48.271+ I REPL [replication-] Initial sync attempt finishing up.
--02T15::48.271+ I REPL [replication-] Initial Sync Attempt Statistics: { failedInitialSyncAttempts: , maxFailedInitialSyncAttempts: , initialSyncStart: new Date(), initialSyncAttempts: [], fetchedMissingDocs: , appliedOps: , initialSyncOplogStart: Timestamp |, initialSyncOplogEnd: Timestamp |, databases: { databasesCloned: , admin: { collections: , clonedCollections: , start: new Date(), end: new Date(), elapsedMillis: , admin.system.version: { documentsToCopy: , documentsCopied: , indexes: , fetchedBatches: , start: new Date(), end: new Date(), elapsedMillis: } }, test: { collections: , clonedCollections: , start: new Date(), end: new Date(), elapsedMillis: , test.blog: { documentsToCopy: , documentsCopied: , indexes: , fetchedBatches: , start: new Date(), end: new Date(), elapsedMillis: } } } }
--02T15::48.352+ I REPL [replication-] initial sync done; took 1s.
--02T15::48.352+ I REPL [replication-] Starting replication fetcher thread
--02T15::48.352+ I REPL [replication-] Starting replication applier thread
--02T15::48.352+ I REPL [replication-] Starting replication reporter thread
--02T15::48.352+ I REPL [rsSync] transition to RECOVERING
--02T15::48.366+ I REPL [rsBackgroundSync] could not find member to sync from
--02T15::48.367+ I REPL [rsSync] transition to SECONDARY
--02T15::03.392+ I REPL [rsBackgroundSync] sync source candidate: node3:
--02T15::03.396+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T15::03.404+ I ASIO [NetworkInterfaceASIO-RS-] Successfully connected to node3:

Adding an arbiter

myapp:PRIMARY> rs.addArb("node3:27019")
{ "ok" : }

The log of the instance on port 27017 reads:

--02T16::59.098+ I REPL     [conn1] replSetReconfig admin command received from client
--02T16::59.116+ I REPL [conn1] replSetReconfig config object with members parses ok
--02T16::59.116+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T16::59.123+ I REPL [ReplicationExecutor] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27019", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T16::59.123+ I REPL [ReplicationExecutor] This node is node3: in the config
--02T16::59.124+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T16::59.124+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T16::59.125+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.127+ I - [conn9] end connection 192.168.244.30: ( connections now open)
--02T16::59.131+ I REPL [ReplicationExecutor] Member node3: is now in state STARTUP
--02T16::59.137+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T16::59.223+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.225+ I NETWORK [conn10] received client metadata from 192.168.244.30: conn10: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T16::59.231+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.232+ I - [conn11] end connection 192.168.244.30: ( connections now open)
--02T16::01.132+ I REPL [ReplicationExecutor] Member node3: is now in state ARBITER

The log of the instance on port 27019 reads:

--02T16::59.115+ I NETWORK  [thread1] connection accepted from 192.168.244.30: # ( connection now open)
--02T16::59.117+ I - [conn1] end connection 192.168.244.30: ( connection now open)
--02T16::59.117+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connection now open)
--02T16::59.122+ I NETWORK [conn2] received client metadata from 192.168.244.30: conn2: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T16::59.125+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.127+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.128+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T16::59.135+ I - [conn4] end connection 192.168.244.30: ( connections now open)
--02T16::59.136+ I NETWORK [conn3] received client metadata from 192.168.244.30: conn3: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T16::59.214+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T16::59.216+ I NETWORK [conn5] received client metadata from 192.168.244.30: conn5: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T16::59.219+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T16::59.227+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T16::59.227+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T16::59.295+ I REPL [ReplicationExecutor] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27019", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T16::59.295+ I REPL [ReplicationExecutor] This node is node3: in the config
--02T16::59.295+ I REPL [ReplicationExecutor] transition to ARBITER
--02T16::59.297+ I REPL [ReplicationExecutor] Member node3: is now in state PRIMARY
--02T16::59.297+ I REPL [ReplicationExecutor] Member node3: is now in state SECONDARY
--02T16::59.132+ I - [conn2] end connection 192.168.244.30: ( connections now open)

Check the status of the replica set

myapp:PRIMARY> rs.status()
{
"set" : "myapp",
"date" : ISODate("2017-05-02T08:10:59.174Z"),
"myState" : ,
"term" : NumberLong(),
"heartbeatIntervalMillis" : NumberLong(),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"appliedOpTime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"durableOpTime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
}
},
"members" : [
{
"_id" : ,
"name" : "node3:27017",
"health" : ,
"state" : ,
"stateStr" : "PRIMARY",
"uptime" : ,
"optime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"optimeDate" : ISODate("2017-05-02T08:10:49Z"),
"electionTime" : Timestamp(, ),
"electionDate" : ISODate("2017-05-02T06:07:37Z"),
"configVersion" : ,
"self" : true
},
{
"_id" : ,
"name" : "node3:27018",
"health" : ,
"state" : ,
"stateStr" : "SECONDARY",
"uptime" : ,
"optime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"optimeDurable" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"optimeDate" : ISODate("2017-05-02T08:10:49Z"),
"optimeDurableDate" : ISODate("2017-05-02T08:10:49Z"),
"lastHeartbeat" : ISODate("2017-05-02T08:10:57.606Z"),
"lastHeartbeatRecv" : ISODate("2017-05-02T08:10:58.224Z"),
"pingMs" : NumberLong(),
"syncingTo" : "node3:27017",
"configVersion" :
},
{
"_id" : ,
"name" : "node3:27019",
"health" : ,
"state" : ,
"stateStr" : "ARBITER",
"uptime" : ,
"lastHeartbeat" : ISODate("2017-05-02T08:10:57.607Z"),
"lastHeartbeatRecv" : ISODate("2017-05-02T08:10:54.391Z"),
"pingMs" : NumberLong(),
"configVersion" :
}
],
"ok" :
}

A replica set can also be created from a configuration document:

> cfg={
... "_id":"myapp",
... "members":[
... {"_id":0,"host":"node3:27017"},
... {"_id":1,"host":"node3:27018"},
... {"_id":2,"host":"node3:27019","arbiterOnly" : true}
... ]}
> rs.initiate(cfg)

Verifying that the replica set works

Create a collection on the primary and insert a document to test with:

# mongo
myapp:PRIMARY> show dbs;
admin .000GB
local .000GB
myapp:PRIMARY> use test
switched to db test
myapp:PRIMARY> db.blog.insert({"title":"My Blog Post"})
WriteResult({ "nInserted" : })
myapp:PRIMARY> db.blog.find();
{ "_id" : ObjectId("59082731008c534e0763e90a"), "title" : "My Blog Post" }
myapp:PRIMARY> quit()

Verify on the secondary:

# mongo --port 27018
myapp:SECONDARY> use test
switched to db test
myapp:SECONDARY> db.blog.find()
Error: error: {
"ok" : ,
"errmsg" : "not master and slaveOk=false",
"code" : ,
"codeName" : "NotMasterNoSlaveOk"
}
myapp:SECONDARY> rs.slaveOk()
myapp:SECONDARY> db.blog.find()
{ "_id" : ObjectId("59082731008c534e0763e90a"), "title" : "My Blog Post" }
myapp:SECONDARY> quit()
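Besides rs.slaveOk(), the shell can declare a read preference for the connection; a sketch (secondaryPreferred is one of the standard read preference modes):

```javascript
// run after connecting to the secondary; applies to this connection only
db.getMongo().setReadPref('secondaryPreferred')
db.blog.find()
```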

Because an arbiter stores no data at all, the document inserted above cannot be read by connecting to the arbiter:

# mongo --port 27019
myapp:ARBITER> use test
switched to db test
myapp:ARBITER> db.blog.find();
Error: error: {
"ok" : ,
"errmsg" : "not master and slaveOk=false",
"code" : ,
"codeName" : "NotMasterNoSlaveOk"
}
myapp:ARBITER> rs.slaveOk()
myapp:ARBITER> db.blog.find()
Error: error: {
"ok" : ,
"errmsg" : "node is not in primary or recovering state",
"code" : ,
"codeName" : "NotMasterOrSecondary"
}
myapp:ARBITER> quit()

Simulate a primary failure and watch the replica set fail over automatically

# ps -ef | grep mongod
root ... mongod --replSet myapp --dbpath /data/27017 --port 27017 --logpath /var/log/mongodb/27017.log --fork
root ... mongod --replSet myapp --dbpath /data/27018 --port 27018 --logpath /var/log/mongodb/27018.log --fork
root ... mongod --replSet myapp --dbpath /data/27019 --port 27019 --logpath /var/log/mongodb/27019.log --fork
# kill -9 <pid of the 27017 instance>

Check the replica set status. Here, connect to the instance on port 27018:

# mongo --port 27018

myapp:PRIMARY> db.isMaster()
{
"hosts" : [
"node3:27017",
"node3:27018"
],
"arbiters" : [
"node3:27019"
],
"setName" : "myapp",
"setVersion" : ,
"ismaster" : true,
"secondary" : false,
"primary" : "node3:27018",
"me" : "node3:27018",
"electionId" : ObjectId("7fffffff0000000000000002"),
"lastWrite" : {
"opTime" : {
"ts" : Timestamp(, ),
"t" : NumberLong()
},
"lastWriteDate" : ISODate("2017-05-02T09:19:02Z")
},
"maxBsonObjectSize" : ,
"maxMessageSizeBytes" : ,
"maxWriteBatchSize" : ,
"localTime" : ISODate("2017-05-02T09:19:04.870Z"),
"maxWireVersion" : ,
"minWireVersion" : ,
"readOnly" : false,
"ok" :
}

As you can see, the primary has switched to the instance on port 27018.
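Failover can also be exercised without killing a process: rs.stepDown(), run on the current primary, forces it to step down and triggers an election. A sketch:

```javascript
// run on the current primary
rs.stepDown(60)   // step down and do not seek re-election for 60 seconds
```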

Correspondingly, the log output of the instance on port 27018 is as follows:

--02T17::51.853+ I -        [conn3] end connection 192.168.244.30: ( connections now open)
--02T17::51.853+ I REPL [replication-] Restarting oplog query due to error: HostUnreachable: End of file. Last fetched optime (with hash): { ts: Timestamp |, t: }[-]. Restarts remaining:
--02T17::51.878+ I ASIO [replication-] dropping unhealthy pooled connection to node3:
--02T17::51.878+ I ASIO [replication-] after drop, pool was empty, going to spawn some connections
--02T17::51.879+ I REPL [replication-] Scheduled new oplog query Fetcher source: node3: database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } query metadata: { $replData: , $ssm: { $secondaryOk: true } } active: timeout: 10000ms shutting down?: first: firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand -- target:node3: db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } active: callbackHandle.valid: callbackHandle.cancelled: attempt: retryPolicy: RetryPolicyImpl maxAttempts: maxTimeMillis: -1ms
--02T17::51.879+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T17::51.879+ I ASIO [NetworkInterfaceASIO-RS-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.880+ I REPL [replication-] Restarting oplog query due to error: HostUnreachable: Connection refused. Last fetched optime (with hash): { ts: Timestamp |, t: }[-]. Restarts remaining:
--02T17::51.880+ I REPL [replication-] Scheduled new oplog query Fetcher source: node3: database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } query metadata: { $replData: , $ssm: { $secondaryOk: true } } active: timeout: 10000ms shutting down?: first: firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand -- target:node3: db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } active: callbackHandle.valid: callbackHandle.cancelled: attempt: retryPolicy: RetryPolicyImpl maxAttempts: maxTimeMillis: -1ms
--02T17::51.880+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T17::51.880+ I ASIO [NetworkInterfaceASIO-RS-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.880+ I REPL [replication-] Restarting oplog query due to error: HostUnreachable: Connection refused. Last fetched optime (with hash): { ts: Timestamp |, t: }[-]. Restarts remaining:
--02T17::51.880+ I REPL [replication-] Scheduled new oplog query Fetcher source: node3: database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } query metadata: { $replData: , $ssm: { $secondaryOk: true } } active: timeout: 10000ms shutting down?: first: firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand -- target:node3: db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp | } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: , term: } active: callbackHandle.valid: callbackHandle.cancelled: attempt: retryPolicy: RetryPolicyImpl maxAttempts: maxTimeMillis: -1ms
--02T17::51.880+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T17::51.883+ I ASIO [NetworkInterfaceASIO-RS-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.884+ I REPL [replication-] Error returned from oplog query (no more query restarts left): HostUnreachable: Connection refused
--02T17::51.884+ W REPL [rsBackgroundSync] Fetcher stopped querying remote oplog with error: HostUnreachable: Connection refused
--02T17::51.884+ I REPL [rsBackgroundSync] could not find member to sync from
--02T17::51.884+ I ASIO [ReplicationExecutor] dropping unhealthy pooled connection to node3:
--02T17::51.884+ I ASIO [ReplicationExecutor] after drop, pool was empty, going to spawn some connections
--02T17::51.884+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::51.885+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.885+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::51.885+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::51.885+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.885+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::51.885+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::51.885+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::51.886+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::54.837+ I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update to node3:: InvalidSyncSource: Sync source was cleared. Was node3:
--02T17::56.886+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::56.886+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::56.886+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::56.886+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::56.887+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::56.887+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::56.887+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::56.887+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::56.887+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::01.560+ I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
--02T17::01.605+ I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
--02T17::01.616+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.626+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.630+ I REPL [ReplicationExecutor] VoteRequester(term dry run) failed to receive response from node3:: HostUnreachable: Connection refused
--02T17::01.637+ I REPL [ReplicationExecutor] VoteRequester(term dry run) received a yes vote from node3:; response message: { term: , voteGranted: true, reason: "", ok: 1.0 }
--02T17::01.638+ I REPL [ReplicationExecutor] dry election run succeeded, running for election
--02T17::01.670+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.670+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.672+ I REPL [ReplicationExecutor] VoteRequester(term ) failed to receive response from node3:: HostUnreachable: Connection refused
--02T17::01.689+ I REPL [ReplicationExecutor] VoteRequester(term ) received a yes vote from node3:; response message: { term: , voteGranted: true, reason: "", ok: 1.0 }
--02T17::01.689+ I REPL [ReplicationExecutor] election succeeded, assuming primary role in term
--02T17::01.689+ I REPL [ReplicationExecutor] transition to PRIMARY
--02T17::01.691+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.692+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.692+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.693+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.693+ I REPL [ReplicationExecutor] My optime is most up-to-date, skipping catch-up and completing transition to primary.
--02T17::01.693+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::01.693+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.693+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.694+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.694+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::01.694+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.694+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.694+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
--02T17::01.694+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T17::02.094+ I REPL [rsSync] transition to primary complete; database writes are now permitted

As the log output shows, when the surviving secondary first detects that the primary is unavailable, MongoDB drops the unhealthy pooled connection ("dropping unhealthy pooled connection to node3:27017") and keeps probing. Once no primary has been seen for 10 seconds (electionTimeoutMillis, which defaults to 10000 ms under replication protocol version 1), automatic failover begins:

--02T17::01.560+ I REPL     [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
--02T17::01.605+ I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
--02T17::01.616+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.626+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.630+ I REPL [ReplicationExecutor] VoteRequester(term dry run) failed to receive response from node3:: HostUnreachable: Connection refused
--02T17::01.637+ I REPL [ReplicationExecutor] VoteRequester(term dry run) received a yes vote from node3:; response message: { term: , voteGranted: true, reason: "", ok: 1.0 }
--02T17::01.638+ I REPL [ReplicationExecutor] dry election run succeeded, running for election
--02T17::01.670+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::01.670+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
--02T17::01.672+ I REPL [ReplicationExecutor] VoteRequester(term ) failed to receive response from node3:: HostUnreachable: Connection refused
--02T17::01.689+ I REPL [ReplicationExecutor] VoteRequester(term ) received a yes vote from node3:; response message: { term: , voteGranted: true, reason: "", ok: 1.0 }
--02T17::01.689+ I REPL [ReplicationExecutor] election succeeded, assuming primary role in term
--02T17::01.689+ I REPL [ReplicationExecutor] transition to PRIMARY
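The election flow quoted above — no primary seen for electionTimeoutMillis, a dry-run vote, then a real vote requiring a majority — can be sketched roughly as follows. This is an illustrative Python simulation of the logic, not MongoDB's internals; the function names are hypothetical, and the threshold is an assumption based on the 10000 ms default shown in the log.

```python
# Illustrative simulation of the election trigger described above.
# NOT mongod code; names and defaults are assumptions for illustration.

ELECTION_TIMEOUT_MS = 10000  # electionTimeoutMillis default in MongoDB 3.4 (pv1)

def should_start_election(now_ms, last_primary_seen_ms,
                          timeout_ms=ELECTION_TIMEOUT_MS):
    """A member calls an election if it has seen no primary for timeout_ms."""
    return now_ms - last_primary_seen_ms >= timeout_ms

def election_succeeds(yes_votes, voting_members):
    """An election needs a strict majority of ALL voting members,
    including members that are currently unreachable."""
    return yes_votes > voting_members // 2

# In the log: a 3-member set where one member (the old primary) is down.
# The candidate votes for itself and receives one more yes vote -> 2 of 3.
assert should_start_election(now_ms=20000, last_primary_seen_ms=9000)
assert election_succeeds(yes_votes=2, voting_members=3)      # elected
assert not election_succeeds(yes_votes=1, voting_members=3)  # no majority
```

Note that the majority is computed over all voting members, not only the reachable ones — which is why a 2-member set cannot elect a primary after one node fails, and why an arbiter is useful in even-sized sets.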

In fact, while the 27017 instance is down, the other two nodes keep sending heartbeat requests to it:

--02T17::08.384+ I ASIO     [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::08.384+ I ASIO [NetworkInterfaceASIO-Replication-] Failed to connect to node3: - HostUnreachable: Connection refused
2017--02T17::08.384+ I REPL [ReplicationExecutor] Error in heartbeat request to node3:; HostUnreachable: Connection refused
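These heartbeat failures recur on a fixed schedule. A hedged Python sketch, assuming the default heartbeatIntervalMillis of 2000 ms, illustrates why the "Error in heartbeat request" lines repeat roughly every two seconds while the peer is down:

```python
# Sketch of a periodic heartbeat probe; purely illustrative, not mongod code.
HEARTBEAT_INTERVAL_MS = 2000  # heartbeatIntervalMillis default

def heartbeat_times(start_ms, end_ms, interval_ms=HEARTBEAT_INTERVAL_MS):
    """Timestamps (ms) at which a member sends heartbeats to a peer."""
    return list(range(start_ms, end_ms, interval_ms))

# While the 27017 instance is down, every one of these probes fails,
# producing one 'Error in heartbeat request' line per probe.
probes = heartbeat_times(0, 10000)
assert probes == [0, 2000, 4000, 6000, 8000]  # 5 failed probes in 10 s
```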

When the 27017 instance comes back online, it automatically rejoins the replica set as a Secondary.

The log output from the 27017 instance as it starts up and rejoins the replica set is shown below:

--02T17::10.616+ I CONTROL  [initandlisten] MongoDB starting : pid= port= dbpath=/data/ -bit host=node3
--02T17::10.616+ I CONTROL [initandlisten] db version v3.4.2
--02T17::10.616+ I CONTROL [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
--02T17::10.616+ I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips Feb
--02T17::10.616+ I CONTROL [initandlisten] allocator: tcmalloc
--02T17::10.616+ I CONTROL [initandlisten] modules: none
--02T17::10.616+ I CONTROL [initandlisten] build environment:
--02T17::10.616+ I CONTROL [initandlisten] distmod: rhel62
--02T17::10.616+ I CONTROL [initandlisten] distarch: x86_64
--02T17::10.616+ I CONTROL [initandlisten] target_arch: x86_64
--02T17::10.616+ I CONTROL [initandlisten] options: { net: { port: }, processManagement: { fork: true }, replication: { replSet: "myapp" }, storage: { dbPath: "/data/27017" }, systemLog: { destination: "file", path: "/var/log/mongodb/27017.log" } }
--02T17::10.616+ W - [initandlisten] Detected unclean shutdown - /data//mongod.lock is not empty.
--02T17::10.645+ I - [initandlisten] Detected data files in /data/ created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
--02T17::10.645+ W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
--02T17::10.645+ I STORAGE [initandlisten]
--02T17::10.645+ I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
--02T17::10.645+ I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
--02T17::10.645+ I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=256M,session_max=,eviction=(threads_max=),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=),checkpoint=(wait=,log_size=2GB),statistics_log=(wait=),
--02T17::11.402+ I STORAGE [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
--02T17::11.436+ I STORAGE [initandlisten] The size storer reports that the oplog contains records totaling to bytes
--02T17::11.436+ I STORAGE [initandlisten] Scanning the oplog to determine where to place markers for truncation
--02T17::11.502+ I CONTROL [initandlisten]
--02T17::11.502+ I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
--02T17::11.502+ I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
--02T17::11.502+ I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
--02T17::11.502+ I CONTROL [initandlisten]
--02T17::11.675+ I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/27017/diagnostic.data'
--02T17::11.744+ I NETWORK [thread1] waiting for connections on port
--02T17::11.797+ I REPL [replExecDBWorker-] New replica set config in use: { _id: "myapp", version: , protocolVersion: , members: [ { _id: , host: "node3:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27018", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: }, { _id: , host: "node3:27019", arbiterOnly: true, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: , votes: } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: , heartbeatTimeoutSecs: , electionTimeoutMillis: , catchUpTimeoutMillis: , getLastErrorModes: {}, getLastErrorDefaults: { w: , wtimeout: }, replicaSetId: ObjectId('59082229517dd35bb9fd0d2a') } }
--02T17::11.797+ I REPL [replExecDBWorker-] This node is node3: in the config
--02T17::11.797+ I REPL [replExecDBWorker-] transition to STARTUP2
--02T17::11.797+ I REPL [replExecDBWorker-] Starting replication storage threads
--02T17::11.798+ I REPL [replExecDBWorker-] Starting replication fetcher thread
--02T17::11.798+ I REPL [replExecDBWorker-] Starting replication applier thread
--02T17::11.798+ I REPL [replExecDBWorker-] Starting replication reporter thread
--02T17::11.799+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::11.799+ I ASIO [NetworkInterfaceASIO-Replication-] Connecting to node3:
--02T17::11.799+ I REPL [rsSync] transition to RECOVERING
--02T17::11.801+ I REPL [rsSync] transition to SECONDARY
--02T17::11.801+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T17::11.801+ I ASIO [NetworkInterfaceASIO-Replication-] Successfully connected to node3:
--02T17::11.802+ I REPL [ReplicationExecutor] Member node3: is now in state ARBITER
--02T17::11.803+ I REPL [ReplicationExecutor] Member node3: is now in state PRIMARY
--02T17::12.116+ I FTDC [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
--02T17::12.388+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connection now open)
--02T17::12.390+ I NETWORK [conn1] received client metadata from 192.168.244.30: conn1: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T17::15.744+ I NETWORK [thread1] connection accepted from 192.168.244.30: # ( connections now open)
--02T17::15.745+ I NETWORK [conn2] received client metadata from 192.168.244.30: conn2: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.2" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server release 6.7 (Santiago)", architecture: "x86_64", version: "Kernel 2.6.32-573.el6.x86_64" } }
--02T17::17.802+ I REPL [rsBackgroundSync] sync source candidate: node3:
--02T17::17.873+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T17::17.875+ I ASIO [NetworkInterfaceASIO-RS-] Successfully connected to node3:
--02T17::18.203+ I ASIO [NetworkInterfaceASIO-RS-] Connecting to node3:
--02T17::18.211+ I ASIO [NetworkInterfaceASIO-RS-] Successfully connected to node3:
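The rejoin sequence visible in the log (STARTUP2 → RECOVERING → SECONDARY, with the peers reported as ARBITER and PRIMARY) can be checked against a small table of member-state transitions. This is a hypothetical, simplified sketch — the real replication state machine has more states (STARTUP, ROLLBACK, REMOVED, ...) and more edges than shown here:

```python
# Simplified member-state transitions seen when a node rejoins the set.
# Hypothetical sketch, not the actual replication state machine.
ALLOWED = {
    "STARTUP2": {"RECOVERING"},
    "RECOVERING": {"SECONDARY"},
    "SECONDARY": {"PRIMARY", "RECOVERING"},
}

def valid_path(states):
    """Check that each consecutive pair of states is an allowed transition."""
    return all(b in ALLOWED.get(a, set()) for a, b in zip(states, states[1:]))

# The exact sequence from the log above:
assert valid_path(["STARTUP2", "RECOVERING", "SECONDARY"])
# A restarted node cannot jump straight to PRIMARY; it must first
# catch up as a SECONDARY and then win an election.
assert not valid_path(["STARTUP2", "PRIMARY"])
```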

References

1. MongoDB in Action (《MongoDB实战》)

2. MongoDB: The Definitive Guide (《MongoDB权威指南》)

3. The official MongoDB documentation
