Source walkthrough: how the up/down state of cinder services is determined

This walkthrough is based on the Ocata release.

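1) When a user runs the cinder service-list command to check the state of the cinder services, the entry point on the cinder side is the index method of ServiceController in cinder/api/contrib/services.py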

class ServiceController(wsgi.Controller):

    def __init__(self, ext_mgr=None):

        self.ext_mgr = ext_mgr

        super(ServiceController, self).__init__()

        self.volume_api = volume.API()

    def index(self, req):

        """Return a list of all running services.

        Filter by host & service name.

        """

        context = req.environ['cinder.context']

        authorize(context, action='index')

        detailed = self.ext_mgr.is_loaded('os-extended-services')

        now = timeutils.utcnow(with_timezone=True)  # current time on the controller node

        filters = {}

        if 'host' in req.GET:

            filters['host'] = req.GET['host']

        if 'binary' in req.GET:

            filters['binary'] = req.GET['binary']

        elif 'service' in req.GET:

            filters['binary'] = req.GET['service']

            versionutils.report_deprecated_feature(LOG, _(

                "Query by service parameter is deprecated. "

                "Please use binary parameter instead."))

        services = objects.ServiceList.get_all(context, filters)  # fetch the list of all cinder services from the DB

        svcs = []

        for svc in services:  # loop over each service

            updated_at = svc.updated_at

            delta = now - (svc.updated_at or svc.created_at)  # use updated_at, or created_at if it is unset, and compute the difference from the current time

            delta_sec = delta.total_seconds()

            if svc.modified_at:

                delta_mod = now - svc.modified_at

                if abs(delta_sec) >= abs(delta_mod.total_seconds()):

                    updated_at = svc.modified_at

            alive = abs(delta_sec) <= CONF.service_down_time  # take the absolute value of the difference and check that it does not exceed service_down_time (default: 60 seconds)

            art = (alive and "up") or "down"  # state is "up" if the difference is at most service_down_time (60 by default), otherwise "down"

            active = 'enabled'

            if svc.disabled:

                active = 'disabled'

            if updated_at:

                updated_at = timeutils.normalize_time(updated_at)

            ret_fields = {'binary': svc.binary, 'host': svc.host,

                          'zone': svc.availability_zone,

                          'status': active, 'state': art,

                          'updated_at': updated_at}

            # On V3.7 we added cluster support

            if req.api_version_request.matches('3.7'):

                ret_fields['cluster'] = svc.cluster_name

            if detailed:

                ret_fields['disabled_reason'] = svc.disabled_reason

                if svc.binary == "cinder-volume":

                    ret_fields['replication_status'] = svc.replication_status

                    ret_fields['active_backend_id'] = svc.active_backend_id

                    ret_fields['frozen'] = svc.frozen

            svcs.append(ret_fields)

        return {'services': svcs}

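So whether a service is up or down depends on the difference between the updated_at column of that service's row in the services table and the current time on the controller node. If the difference is within the configured limit (service_down_time), the service is considered up; otherwise it is considered down. How, then, is each service's updated_at value kept up to date?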
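The comparison can be condensed into a small standalone sketch. The function name service_state and its parameters are hypothetical and only mirror the check in index; the real code reads the threshold from CONF.service_down_time and additionally looks at modified_at when choosing which timestamp to display.

```python
from datetime import datetime, timedelta, timezone

SERVICE_DOWN_TIME = 60  # seconds; mirrors the default of CONF.service_down_time


def service_state(now, updated_at, created_at, down_time=SERVICE_DOWN_TIME):
    """Return "up" or "down" using the same comparison as the index method."""
    # Fall back to created_at when the service has never reported.
    last_seen = updated_at or created_at
    delta_sec = (now - last_seen).total_seconds()
    # abs() means a heartbeat that appears to come from the future
    # (clock skew between nodes) is treated like an old one.
    return "up" if abs(delta_sec) <= down_time else "down"


now = datetime(2019, 6, 14, 9, 23, 0, tzinfo=timezone.utc)
created = datetime(2018, 8, 16, 7, 29, 20, tzinfo=timezone.utc)

print(service_state(now, now - timedelta(seconds=37), created))  # up
print(service_state(now, now - timedelta(seconds=61), created))  # down
print(service_state(now, None, created))                         # down
```

Note that a difference of exactly 60 seconds still counts as up, because the comparison is <= rather than <.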

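2) How updated_at is refreshed in the database for each cinder service. The timestamp comes from the physical node the service runs on: each report writes that node's current time to the database and increments the report counter by 1.

Cinder services such as cinder-scheduler, cinder-volume, and cinder-backup are each an instance of class Service(service.Service) in cinder/service.py
(such an instance uses a manager and enables RPC by listening on topic-based queues; it also periodically runs tasks on the manager and reports its state to the services table in the database).
The start method of this class is as follows: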

    def start(self):

        version_string = version.version_string()

        LOG.info(_LI('Starting %(topic)s node (version %(version_string)s)'),

                 {'topic': self.topic, 'version_string': version_string})

        self.model_disconnected = False

        if self.coordination:

            coordination.COORDINATOR.start()

        # Calls init_host on the manager. In the volume manager this calls
        # driver.do_setup, driver.check_for_setup_error and
        # driver.init_capabilities in turn; init_capabilities then calls the
        # driver's get_volume_stats to fetch the storage backend's state.

        self.manager.init_host(added_to_cluster=self.added_to_cluster,

                               service_id=Service.service_id)

        LOG.debug("Creating RPC server for service %s", self.topic)

        ctxt = context.get_admin_context()

        endpoints = [self.manager]

        endpoints.extend(self.manager.additional_endpoints)

        obj_version_cap = objects.Service.get_minimum_obj_version(ctxt)

        LOG.debug("Pinning object versions for RPC server serializer to %s",

                  obj_version_cap)

        serializer = objects_base.CinderObjectSerializer(obj_version_cap)

        target = messaging.Target(topic=self.topic, server=self.host)

        self.rpcserver = rpc.get_server(target, endpoints, serializer)

        self.rpcserver.start()

        # NOTE(dulek): Kids, don't do that at home. We're relying here on

        # oslo.messaging implementation details to keep backward compatibility

        # with pre-Ocata services. This will not matter once we drop

        # compatibility with them.

        if self.topic == constants.VOLUME_TOPIC:

            target = messaging.Target(

                topic='%(topic)s.%(host)s' % {'topic': self.topic,

                                              'host': self.host},

                server=vol_utils.extract_host(self.host, 'host'))

            self.backend_rpcserver = rpc.get_server(target, endpoints,

                                                    serializer)

            self.backend_rpcserver.start()

        # TODO(geguileo): In O - Remove the is_svc_upgrading_to_n part

        if self.cluster and not self.is_svc_upgrading_to_n(self.binary):

            LOG.info(_LI('Starting %(topic)s cluster %(cluster)s (version '

                         '%(version)s)'),

                     {'topic': self.topic, 'version': version_string,

                      'cluster': self.cluster})

            target = messaging.Target(

                topic='%s.%s' % (self.topic, self.cluster),

                server=vol_utils.extract_host(self.cluster, 'host'))

            serializer = objects_base.CinderObjectSerializer(obj_version_cap)

            self.cluster_rpcserver = rpc.get_server(target, endpoints,

                                                    serializer)

            self.cluster_rpcserver.start()

        self.manager.init_host_with_rpc()

        # If report_interval is set, the service starts a looping call that runs
        # report_state every report_interval seconds (default: 10, i.e. one
        # report every 10 seconds).

        if self.report_interval:

            pulse = loopingcall.FixedIntervalLoopingCall(  # looping call; self.report_state is the task it runs repeatedly

                self.report_state)

            pulse.start(interval=self.report_interval,  # start running the loop (s1)

                        initial_delay=self.report_interval)

            self.timers.append(pulse)

        if self.periodic_interval:

            if self.periodic_fuzzy_delay:

                initial_delay = random.randint(0, self.periodic_fuzzy_delay)

            else:

                initial_delay = None

            periodic = loopingcall.FixedIntervalLoopingCall(

                self.periodic_tasks)

            periodic.start(interval=self.periodic_interval,

                           initial_delay=initial_delay)

            self.timers.append(periodic)
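How report_interval relates to service_down_time can be seen in a small simulation. This is a sketch with simulated time, not the real FixedIntervalLoopingCall; missed_reports_before_down is a hypothetical helper, and 10 and 60 are the default values mentioned in this article.

```python
def missed_reports_before_down(report_interval=10, service_down_time=60):
    """Count how many consecutive heartbeats can be missed while still "up".

    Time is simulated in whole seconds: the last successful heartbeat is at
    t=0 and every later heartbeat (t=10, 20, ...) is assumed to fail.
    """
    missed = 0
    t = report_interval
    # The API treats the service as alive while the age of the last
    # successful heartbeat is <= service_down_time.
    while t <= service_down_time:
        missed += 1
        t += report_interval
    return missed


print(missed_reports_before_down())        # 6
print(missed_reports_before_down(10, 25))  # 2
```

With the defaults, a service is still listed as up after six consecutive missed reports and turns down once more than 60 seconds have passed since the last successful one.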

Implementation of report_state, the method run by the loop started at s1:

    def report_state(self):  # writes this service's state to the database

        """Update the state of this service in the datastore."""

        if not self.manager.is_working():

            # NOTE(dulek): If manager reports a problem we're not sending

            # heartbeats - to indicate that service is actually down.

            LOG.error(_LE('Manager for service %(binary)s %(host)s is '

                          'reporting problems, not sending heartbeat. '

                          'Service will appear "down".'),

                      {'binary': self.binary,

                       'host': self.host})

            return

        ctxt = context.get_admin_context()

        zone = CONF.storage_availability_zone

        try:

            try:

                service_ref = objects.Service.get_by_id(ctxt, Service.service_id)  # look up this service in the database by service_id

            except exception.NotFound:

                LOG.debug('The service database object disappeared, '

                          'recreating it.')

                self._create_service_ref(ctxt)

                service_ref = objects.Service.get_by_id(ctxt, Service.service_id)

            service_ref.report_count += 1  # increment the report counter by 1

            if zone != service_ref.availability_zone:

                service_ref.availability_zone = zone

            service_ref.save()

            # TODO(termie): make this pattern be more elegant.

            if getattr(self, 'model_disconnected', False):

                self.model_disconnected = False

                LOG.error(_LE('Recovered model server connection!'))

        # The except clauses that close the outer try block are not shown in this excerpt.
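Because the heartbeat timestamp comes from the node running the service while the comparison uses the controller's clock, clock skew between the two nodes affects the result. The sketch below illustrates this with an in-memory stand-in for one row of the services table; the dict row and the helpers report_state and is_alive are hypothetical simplifications, not cinder's API.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for one row of the services table.
row = {"report_count": 0, "updated_at": None}


def report_state(row, node_clock):
    """Mimic one heartbeat: bump the counter and stamp the reporting node's time."""
    row["report_count"] += 1
    row["updated_at"] = node_clock  # time as seen by the node running the service


def is_alive(row, controller_clock, service_down_time=60):
    """Same comparison the API performs when listing services."""
    delta = (controller_clock - row["updated_at"]).total_seconds()
    return abs(delta) <= service_down_time


controller_now = datetime(2019, 6, 14, 9, 22, 23, tzinfo=timezone.utc)

# Clocks in sync: the heartbeat is fresh, so the service is alive.
report_state(row, controller_now)
in_sync = is_alive(row, controller_now)

# Service node clock is 5 minutes behind the controller: the heartbeat was
# just written, yet it already looks 300 seconds old.
report_state(row, controller_now - timedelta(minutes=5))
skewed = is_alive(row, controller_now)

print(row["report_count"], in_sync, skewed)  # 2 True False
```

Under these assumptions, a service that reports regularly can still be listed as down when the clocks of the service node and the controller differ by more than service_down_time, which is why keeping the nodes' clocks synchronized matters.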

3) Columns of the services table

mysql> desc services;
+------------------------+--------------+------+-----+---------+----------------+
| Field                  | Type         | Null | Key | Default | Extra          |
+------------------------+--------------+------+-----+---------+----------------+
| created_at             | datetime     | YES  |     | NULL    |                |
| updated_at             | datetime     | YES  |     | NULL    |                |
| deleted_at             | datetime     | YES  |     | NULL    |                |
| deleted                | tinyint(1)   | YES  |     | NULL    |                |
| id                     | int(11)      | NO   | PRI | NULL    | auto_increment |
| host                   | varchar(255) | YES  |     | NULL    |                |
| binary                 | varchar(255) | YES  |     | NULL    |                |
| topic                  | varchar(255) | YES  |     | NULL    |                |
| report_count           | int(11)      | NO   |     | NULL    |                |
| disabled               | tinyint(1)   | YES  |     | NULL    |                |
| availability_zone      | varchar(255) | YES  |     | NULL    |                |
| disabled_reason        | varchar(255) | YES  |     | NULL    |                |
| modified_at            | datetime     | YES  |     | NULL    |                |
| rpc_current_version    | varchar(36)  | YES  |     | NULL    |                |
| object_current_version | varchar(36)  | YES  |     | NULL    |                |
| replication_status     | varchar(36)  | YES  |     | NULL    |                |
| frozen                 | tinyint(1)   | YES  |     | NULL    |                |
| active_backend_id      | varchar(255) | YES  |     | NULL    |                |
| cluster_name           | varchar(255) | YES  |     | NULL    |                |
+------------------------+--------------+------+-----+---------+----------------+
19 rows in set (0.01 sec)

Sample row

mysql> select * from services limit 2\G;
*************************** 1. row ***************************
            created_at: 2018-08-16 07:29:20
            updated_at: 2019-06-14 09:22:23
            deleted_at: NULL
               deleted: 0
                    id: 1
                  host: 10.24.1.9
                binary: cinder-scheduler
                 topic: cinder-scheduler
          report_count: 838433
              disabled: 0
     availability_zone: nova
       disabled_reason: NULL
           modified_at: NULL
   rpc_current_version: 3.5
object_current_version: 1.21
    replication_status: not-capable
                frozen: 0
     active_backend_id: NULL
          cluster_name: NULL
*************************** 2. row ***************************
