This post is a brief walkthrough of deploying an orchestrator Raft cluster. The deployment itself is straightforward; most of the effort goes into getting the configuration file right. The sample configuration below is tailored to our own environment, so adjust it to your situation before using it.

The walkthrough covers three parts: the configuration file, the daemon script, and the proxy setup (the official docs recommend putting a proxy in front of the cluster so that all requests land on the leader).

Machine list:

192.168.1.100

192.168.1.101

192.168.1.102

Three machines in total, with orchestrator deployed on each of them.

1. Configuration file

The backend database here is sqlite3; to use MySQL as the backend instead, change BackendDB and adjust the MySQLOrchestrator* settings accordingly (keep such notes out of the file itself, since JSON does not allow comments):

{
"Debug": true,
"EnableSyslog": false,
"ListenAddress": ":3000",
"AgentsServerPort": ":3001",
"MySQLTopologyUser": "orc_client_user",
"MySQLTopologyPassword": "orc_client_user_password",
"MySQLTopologyCredentialsConfigFile": "",
"MySQLTopologySSLPrivateKeyFile": "",
"MySQLTopologySSLCertFile": "",
"MySQLTopologySSLCAFile": "",
"MySQLTopologySSLSkipVerify": true,
"MySQLTopologyUseMutualTLS": false,
"BackendDB": "sqlite3",
"SQLite3DataFile": "/export/orc-sqlite3.db",
"MySQLOrchestratorHost": "192.168.1.100",
"MySQLOrchestratorPort": 3358,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "orchestrator_rw",
"MySQLOrchestratorPassword": "orchestrator_pwd",
"MySQLOrchestratorCredentialsConfigFile": "",
"MySQLOrchestratorSSLPrivateKeyFile": "",
"MySQLOrchestratorSSLCertFile": "",
"MySQLOrchestratorSSLCAFile": "",
"MySQLOrchestratorSSLSkipVerify": true,
"MySQLOrchestratorUseMutualTLS": false,
"MySQLConnectTimeoutSeconds": 5,
"DefaultInstancePort": 3306,
"SkipOrchestratorDatabaseUpdate": false,
"SlaveLagQuery": "",
"DiscoverByShowSlaveHosts": true,
"InstancePollSeconds": 30,
"UnseenInstanceForgetHours": 240,
"SnapshotTopologiesIntervalHours": 0,
"InstanceBulkOperationsWaitTimeoutSeconds": 10,
"HostnameResolveMethod": "none",
"MySQLHostnameResolveMethod": "none",
"SkipBinlogServerUnresolveCheck": true,
"ExpiryHostnameResolvesMinutes": 60,
"RejectHostnameResolvePattern": "",
"ReasonableReplicationLagSeconds": 10,
"ProblemIgnoreHostnameFilters": [],
"VerifyReplicationFilters": false,
"ReasonableMaintenanceReplicationLagSeconds": 20,
"CandidateInstanceExpireMinutes": 60,
"AuditLogFile": "/tmp/orchestrator-audit.log",
"AuditToSyslog": false,
"RemoveTextFromHostnameDisplay": ".mydomain.com:3358",
"ReadOnly": false,
"AuthenticationMethod": "",
"HTTPAuthUser": "",
"HTTPAuthPassword": "",
"AuthUserHeader": "",
"PowerAuthUsers": [
"*"
],
"ClusterNameToAlias": {
"127.0.0.1": "test suite"
},
"DetectClusterAliasQuery": "SELECT value FROM _vt.local_metadata WHERE name='ClusterAlias'",
"DetectClusterDomainQuery": "",
"DetectInstanceAliasQuery": "SELECT value FROM _vt.local_metadata WHERE name='Alias'",
"DetectPromotionRuleQuery": "SELECT value FROM _vt.local_metadata WHERE name='PromotionRule'",
"DataCenterPattern": "",
"PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]mydomain[.]com",
"DetectDataCenterQuery": "SELECT value FROM _vt.local_metadata where name='DataCenter'",
"PromotionIgnoreHostnameFilters": [],
"DetectSemiSyncEnforcedQuery": "SELECT @@global.rpl_semi_sync_master_wait_no_slave AND @@global.rpl_semi_sync_master_timeout > 1000000",
"ServeAgentsHttp": false,
"AgentsUseSSL": false,
"AgentsUseMutualTLS": false,
"AgentSSLSkipVerify": false,
"AgentSSLPrivateKeyFile": "",
"AgentSSLCertFile": "",
"AgentSSLCAFile": "",
"AgentSSLValidOUs": [],
"UseSSL": false,
"UseMutualTLS": false,
"SSLSkipVerify": false,
"SSLPrivateKeyFile": "",
"SSLCertFile": "",
"SSLCAFile": "",
"SSLValidOUs": [],
"StatusEndpoint": "/api/status",
"StatusSimpleHealth": true,
"StatusOUVerify": false,
"AgentPollMinutes": 60,
"UnseenAgentForgetHours": 6,
"StaleSeedFailMinutes": 60,
"SeedAcceptableBytesDiff": 8192,
"PseudoGTIDPattern": "drop view if exists .*?`_pseudo_gtid_hint__",
"PseudoGTIDMonotonicHint": "asc:",
"DetectPseudoGTIDQuery": "",
"BinlogEventsChunkSize": 10000,
"SkipBinlogEventsContaining": [],
"ReduceReplicationAnalysisCount": true,
"FailureDetectionPeriodBlockMinutes": 60,
"RecoveryPeriodBlockMinutes": 60,
"RecoveryPeriodBlockSeconds": 3600,
"RecoveryIgnoreHostnameFilters": [
".*"
],
"RecoverMasterClusterFilters": [
".*"
],
"RecoverIntermediateMasterClusterFilters": [
"_intermediate_master_pattern_"
],
"OnFailureDetectionProcesses": [
"echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log"
],
"PreFailoverProcesses": [
"echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
],
"PostFailoverProcesses": [
"echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostUnsuccessfulFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log",
"curl -d '{\"isSuccessful\": {isSuccessful}, \"failureType\": \"{failureType}\", \"failureDescription\": \"{failureDescription}\", \"failedHost\": \"{failedHost}\", \"failedPort\": {failedPort}, \"failureCluster\": \"{failureCluster}\", \"failureClusterAlias\": \"{failureClusterAlias}\", \"failureClusterDomain\": \"{failureClusterDomain}\", \"countSlaves\": {countSlaves}, \"countReplicas\": {countReplicas}, \"isDowntimed\": {isDowntimed}, \"autoMasterRecovery\": {autoMasterRecovery}, \"autoIntermediateMasterRecovery\": {autoIntermediateMasterRecovery}, \"orchestratorHost\": \"{orchestratorHost}\", \"recoveryUID\": \"{recoveryUID}\", \"lostSlaves\": \"{lostSlaves}\", \"lostReplicas\": \"{lostReplicas}\", \"slaveHosts\": \"{slaveHosts}\", \"replicaHosts\": \"{replicaHosts}\"}' http://test.domain.com/api/failover"
],
"PostMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log",
"curl -d '{\"isSuccessful\": {isSuccessful}, \"failureType\": \"{failureType}\", \"failureDescription\": \"{failureDescription}\", \"failedHost\": \"{failedHost}\", \"failedPort\": {failedPort}, \"failureCluster\": \"{failureCluster}\", \"failureClusterAlias\": \"{failureClusterAlias}\", \"failureClusterDomain\": \"{failureClusterDomain}\", \"countSlaves\": {countSlaves}, \"countReplicas\": {countReplicas}, \"isDowntimed\": {isDowntimed}, \"autoMasterRecovery\": {autoMasterRecovery}, \"autoIntermediateMasterRecovery\": {autoIntermediateMasterRecovery}, \"orchestratorHost\": \"{orchestratorHost}\", \"recoveryUID\": \"{recoveryUID}\", \"successorHost\": \"{successorHost}\", \"successorPort\": {successorPort}, \"lostSlaves\": \"{lostSlaves}\", \"lostReplicas\": \"{lostReplicas}\", \"slaveHosts\": \"{slaveHosts}\", \"successorAlias\": \"{successorAlias}\",\"replicaHosts\": \"{replicaHosts}\"}' http://test.domain.com/api/failover"
],
"PostIntermediateMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"CoMasterRecoveryMustPromoteOtherCoMaster": true,
"DetachLostSlavesAfterMasterFailover": true,
"ApplyMySQLPromotionAfterMasterFailover": true,
"MasterFailoverLostInstancesDowntimeMinutes": 0,
"PostponeSlaveRecoveryOnLagMinutes": 0,
"OSCIgnoreHostnameFilters": [],
"GraphiteAddr": "",
"GraphitePath": "",
"GraphiteConvertHostnameDotsToUnderscores": true,
"RaftEnabled": true,
"RaftDataDir": "/export/Data/orchestrator",
"RaftBind": "192.168.1.100",
"RaftAdvertise": "192.168.1.100",
"DefaultRaftPort": 10008,
"RaftNodes": [
"192.168.1.100",
"192.168.1.101",
"192.168.1.102"
]
}

The listing above is the configuration for 192.168.1.100. The other two machines use an almost identical file; the main thing is to change the values of RaftBind and RaftAdvertise to each machine's own address. Tune the remaining parameters to your own needs.
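Since only RaftBind and RaftAdvertise differ between the three nodes, the per-node files can be generated from a single template. A minimal sketch — the file names are hypothetical, and the template here is shortened to just the two lines that change:

```shell
#!/bin/bash
set -e
# Shortened stand-in for the full config: only the two per-node keys.
cat > orchestrator.conf.json.tpl <<'EOF'
"RaftBind": "192.168.1.100",
"RaftAdvertise": "192.168.1.100",
EOF

# Rewrite both keys for each cluster member, one output file per node.
for ip in 192.168.1.100 192.168.1.101 192.168.1.102; do
  sed -e "s/\"RaftBind\": \"[^\"]*\"/\"RaftBind\": \"${ip}\"/" \
      -e "s/\"RaftAdvertise\": \"[^\"]*\"/\"RaftAdvertise\": \"${ip}\"/" \
      orchestrator.conf.json.tpl > "orchestrator.conf.json.${ip}"
done
```

Each generated file can then be shipped to its node as orchestrator.conf.json.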

Once the configuration is ready, package the program and place it under /export/App/orchestrator.

2. SQLite database file

Since the configuration uses the sqlite3 backend, the SQLite data file (SQLite3DataFile above) needs to live at the configured path on each node.
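The paths below come from the configuration above. As a sketch, the layout can be staged locally first and then copied to / on each node ("stage" is a hypothetical staging directory, not part of the deployment):

```shell
#!/bin/bash
set -e
# Hypothetical local staging dir; the layout mirrors the config paths.
STAGE="$(pwd)/stage"
mkdir -p "$STAGE/export/App/orchestrator" \
         "$STAGE/export/Data/orchestrator" \
         "$STAGE/export/Logs/orchestrator"
# An empty data file is enough: with BackendDB=sqlite3, orchestrator
# initializes the schema on first start if the file is missing.
touch "$STAGE/export/orc-sqlite3.db"
```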

3. Daemon script

With the configuration in place, the next step is a daemon (init.d) script so the service is easy to restart when it fails.

The script looks like this:

#!/bin/bash
# orchestrator daemon
# chkconfig: 235 20 80
# description: orchestrator daemon
# processname: orchestrator
# Script credit: http://werxltd.com/wp/2012/01/05/simple-init-d-script-template/

# Execution paths; these must match where the binary and config were placed.
DAEMON_PATH="/export/App/orchestrator"
DAEMON=orchestrator
DAEMONOPTS="-config /export/App/orchestrator/orchestrator.conf.json --verbose http"
NAME=orchestrator
DESC="orchestrator: MySQL replication management and visualization"
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME

# Limit the number of file descriptors (and sockets) used by
# orchestrator. This setting should be fine in most cases but a
# large busy environment may reach this limit. If exceeded expect
# to see errors of the form:
#   ERROR dial tcp 10.1.2.3:3306: connect: cannot assign requested address
# To avoid touching this script you can use /etc/orchestrator_profile
# to increase this limit.
ulimit -n 16384   # value from the upstream orchestrator init script; adjust as needed

# initially a noop but can be adjusted by modifying orchestrator_profile
# - see https://github.com/github/orchestrator/issues/227 for more details.
post_start_daemon_hook () {
    # by default do nothing
    :
}

# Start the orchestrator daemon in the background.
# Note: stdout/stderr are redirected to /export/Logs/orchestrator, so
# that is the directory to check for logs.
start_daemon () {
    # start up daemon in the background
    $DAEMON_PATH/$DAEMON $DAEMONOPTS >> /export/Logs/orchestrator/${NAME}.log 2>&1 &
    # collect and print PID of started process
    echo $!
    # space for optional processing after starting orchestrator
    # - redirect stdout to stderr to prevent this corrupting the pid info
    post_start_daemon_hook 1>&2
}

# The file /etc/orchestrator_profile can be used to inject pre-service execution
# scripts, such as exporting variables or whatever. It's yours!
#[ -f /etc/orchestrator_profile ] && . /etc/orchestrator_profile

case "$1" in
start)
    printf "%-50s" "Starting $NAME..."
    cd $DAEMON_PATH
    PID=$(start_daemon)
    #echo "Saving PID" $PID " to " $PIDFILE
    if [ -z "$PID" ]; then
        printf "%s\n" "Fail"
        exit 1
    elif [ -z "$(ps axf | awk '{print $1}' | grep ${PID})" ]; then
        printf "%s\n" "Fail"
        exit 1
    else
        echo $PID > $PIDFILE
        printf "%s\n" "Ok"
    fi
    ;;
status)
    printf "%-50s" "Checking $NAME..."
    if [ -f $PIDFILE ]; then
        PID=$(cat $PIDFILE)
        if [ -z "$(ps axf | awk '{print $1}' | grep ${PID})" ]; then
            printf "%s\n" "Process dead but pidfile exists"
            exit 1
        else
            echo "Running"
        fi
    else
        printf "%s\n" "Service not running"
        exit 1
    fi
    ;;
stop)
    printf "%-50s" "Stopping $NAME"
    PID=$(cat $PIDFILE)
    cd $DAEMON_PATH
    if [ -f $PIDFILE ]; then
        kill -TERM $PID
        rm -f $PIDFILE
        # Wait for orchestrator to stop otherwise restart may fail.
        # (The newly restarted process may be unable to bind to the
        # currently bound socket.)
        while ps -p $PID >/dev/null 2>&1; do
            printf "."
            sleep 1
        done
        printf "\n"
        printf "Ok\n"
    else
        printf "%s\n" "pidfile not found"
        exit 1
    fi
    ;;
restart)
    $0 stop
    $0 start
    ;;
reload)
    PID=$(cat $PIDFILE)
    cd $DAEMON_PATH
    if [ -f $PIDFILE ]; then
        kill -HUP $PID
        printf "%s\n" "Ok"
    else
        printf "%s\n" "pidfile not found"
        exit 1
    fi
    ;;
*)
    echo "Usage: $0 {status|start|stop|restart|reload}"
    exit 1
esac

If you place the binary in a different directory, just change the corresponding paths; nothing else needs to be modified.

Once everything is in place on all three machines, start the service and check that leader election works.
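One quick way to check the election is the orchestrator HTTP API: /api/raft-state reports whether a node is the Leader or a Follower, and /api/leader-check returns HTTP 200 only on the leader. A sketch (port 3000 matches the ListenAddress in the configuration above):

```shell
#!/bin/bash
# Query the raft state of every node; a healthy cluster shows exactly
# one Leader. curl failures are tolerated so that one dead node does
# not abort the loop.
raft_state() {
  local ip=$1
  printf '%s: %s\n' "$ip" \
    "$(curl -s --max-time 2 "http://${ip}:3000/api/raft-state" || echo unreachable)"
}

for ip in 192.168.1.100 192.168.1.101 192.168.1.102; do
  raft_state "$ip"
done
```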

4. haproxy proxy

Install haproxy:

yum install haproxy

Configuration file (/etc/haproxy/haproxy.cfg):

#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # four worker processes, one per "bind ... process N" line below
    nbproc      4

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# forward all orchestrator traffic to the current raft leader.
# The health check /api/leader-check succeeds only on the leader, and
# "balance first" always picks the first healthy server, so every
# request lands on the leader. The frontend port (80) and the numeric
# check/timeout values follow the haproxy sample in the orchestrator
# docs; adjust them to your environment.
#---------------------------------------------------------------------
listen orchestrator
    bind 0.0.0.0:80 process 1
    bind 0.0.0.0:80 process 2
    bind 0.0.0.0:80 process 3
    bind 0.0.0.0:80 process 4
    mode http
    stats enable
    stats refresh 30s
    stats uri /admin
    option httpchk GET /api/leader-check
    maxconn 20000
    balance first
    retries 1
    timeout connect 1000
    timeout check 300
    timeout server 30s
    timeout client 30s
    default-server port 3000 fall 1 inter 1000 rise 1 downinter 1000 on-marked-down shutdown-sessions weight 10
    server 192.168.1.100 192.168.1.100:3000 check
    server 192.168.1.101 192.168.1.101:3000 check
    server 192.168.1.102 192.168.1.102:3000 check

Restart the haproxy service:

service haproxy restart

After starting, open the stats page (stats uri /admin above) to check the backend status; exactly one backend, the current raft leader, should be UP.
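The same check can be done from the command line via the haproxy stats socket (the socket path matches the "stats socket" line in the global section; socat is assumed to be installed — nc -U works too):

```shell
#!/bin/bash
# Filter haproxy "show stat" CSV output down to UP servers in the
# orchestrator listener. In the CSV, field 1 is the proxy name,
# field 2 the server name, and field 18 the status.
show_up_backends() {
  awk -F, '$1=="orchestrator" && $18=="UP" {print $2, $18}'
}

# Because the health check is /api/leader-check, only the current raft
# leader should appear here.
echo "show stat" | socat stdio /var/lib/haproxy/stats | show_up_backends
```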
