OpenStack之虚机热迁移代码解析

  话说虚机迁移分为冷迁移以及热迁移,所谓热迁移用度娘的话说即是:热迁移(Live Migration,又叫动态迁移、实时迁移),即虚机保存/恢复(Save/Restore):将整个虚拟机的运行状态完整保存下来,同时可以快速的恢复到原有硬件平台甚至是不同硬件平台上。恢复以后,虚机仍旧平滑运行,用户不会察觉到任何差异。OpenStack的虚机迁移是基于Libvirt实现的,下面来看看Openstack虚机热迁移的具体代码实现。

  首先,由API入口进入到nova/api/openstack/compute/contrib/admin_actions.py

  @wsgi.action('os-migrateLive')
def _migrate_live(self, req, id, body):
"""Permit admins to (live) migrate a server to a new host."""
context = req.environ["nova.context"]
authorize(context, 'migrateLive') try:
block_migration = body["os-migrateLive"]["block_migration"]
disk_over_commit = body["os-migrateLive"]["disk_over_commit"]
host = body["os-migrateLive"]["host"]
except (TypeError, KeyError):
msg = _("host, block_migration and disk_over_commit must "
"be specified for live migration.")
raise exc.HTTPBadRequest(explanation=msg) try:
block_migration = strutils.bool_from_string(block_migration,
strict=True)
disk_over_commit = strutils.bool_from_string(disk_over_commit,
strict=True)
except ValueError as err:
raise exc.HTTPBadRequest(explanation=str(err)) try:
instance = self.compute_api.get(context, id, want_objects=True)
self.compute_api.live_migrate(context, instance, block_migration,
disk_over_commit, host)
except (exception.ComputeServiceUnavailable,
exception.InvalidHypervisorType,
exception.UnableToMigrateToSelf,
exception.DestinationHypervisorTooOld,
exception.NoValidHost,
exception.InvalidLocalStorage,
exception.InvalidSharedStorage,
exception.MigrationPreCheckError) as ex:
raise exc.HTTPBadRequest(explanation=ex.format_message())
except exception.InstanceNotFound as e:
raise exc.HTTPNotFound(explanation=e.format_message())
except exception.InstanceInvalidState as state_error:
common.raise_http_conflict_for_instance_invalid_state(state_error,
'os-migrateLive')
except Exception:
if host is None:
msg = _("Live migration of instance %s to another host "
"failed") % id
else:
msg = _("Live migration of instance %(id)s to host %(host)s "
"failed") % {'id': id, 'host': host}
LOG.exception(msg)
# Return messages from scheduler
raise exc.HTTPBadRequest(explanation=msg) return webob.Response(status_int=202)

    这里第一行可以看到是与API文档的第二行照应的:

 {
"os-migrateLive": {
"host": "0443e9a1254044d8b99f35eace132080",
"block_migration": false,
"disk_over_commit": false
}
}

  好了,源码中其实执行迁移工作的就是第26、27行的一条语句:

 self.compute_api.live_migrate(context, instance, block_migration,
disk_over_commit, host)

  由这句进入到nova/compute/api.py中,源码如下:

     @check_instance_cell
@check_instance_state(vm_state=[vm_states.ACTIVE])
def live_migrate(self, context, instance, block_migration,
disk_over_commit, host_name):
"""Migrate a server lively to a new host."""
LOG.debug(_("Going to try to live migrate instance to %s"),
host_name or "another host", instance=instance) instance.task_state = task_states.MIGRATING
instance.save(expected_task_state=[None]) self.compute_task_api.live_migrate_instance(context, instance,
host_name, block_migration=block_migration,
disk_over_commit=disk_over_commit)

  第2行是一个装饰器,用于在进入API方法之前,检测虚拟机和/或任务的状态, 如果实例处于错误的状态,将会引发异常;接下来实时迁移虚机到新的主机,并将虚机状态置于“migrating”,然后由12行进入nova/conductor/api.py

 def live_migrate_instance(self, context, instance, host_name,
block_migration, disk_over_commit):
scheduler_hint = {'host': host_name}
self._manager.migrate_server(
context, instance, scheduler_hint, True, False, None,
block_migration, disk_over_commit, None)

  将主机名存入字典scheduler_hint中,然后调用nova/conductor/manager.py方法migrate_server,

 def migrate_server(self, context, instance, scheduler_hint, live, rebuild,
flavor, block_migration, disk_over_commit, reservations=None):
if instance and not isinstance(instance, instance_obj.Instance):
# NOTE(danms): Until v2 of the RPC API, we need to tolerate
# old-world instance objects here
attrs = ['metadata', 'system_metadata', 'info_cache',
'security_groups']
instance = instance_obj.Instance._from_db_object(
context, instance_obj.Instance(), instance,
expected_attrs=attrs)
if live and not rebuild and not flavor:
self._live_migrate(context, instance, scheduler_hint,
block_migration, disk_over_commit)
elif not live and not rebuild and flavor:
instance_uuid = instance['uuid']
with compute_utils.EventReporter(context, self.db,
'cold_migrate', instance_uuid):
self._cold_migrate(context, instance, flavor,
scheduler_hint['filter_properties'],
reservations)
else:
raise NotImplementedError()

  由于在nova/conductor/api.py中传过来的参数是

 self._manager.migrate_server(
context, instance, scheduler_hint, True, False, None,
block_migration, disk_over_commit, None)

  因此live是True,rebuild是Flase,flavor是None,执行第12、13行代码:

 if live and not rebuild and not flavor:
self._live_migrate(context, instance, scheduler_hint,
block_migration, disk_over_commit)

  _live_migrate代码如下:

  def _live_migrate(self, context, instance, scheduler_hint,
block_migration, disk_over_commit):
destination = scheduler_hint.get("host")
try:
live_migrate.execute(context, instance, destination,
block_migration, disk_over_commit)
except (exception.NoValidHost,
exception.ComputeServiceUnavailable,
exception.InvalidHypervisorType,
exception.InvalidCPUInfo,
exception.UnableToMigrateToSelf,
exception.DestinationHypervisorTooOld,
exception.InvalidLocalStorage,
exception.InvalidSharedStorage,
exception.HypervisorUnavailable,
exception.MigrationPreCheckError) as ex:
with excutils.save_and_reraise_exception():
#TODO(johngarbutt) - eventually need instance actions here
request_spec = {'instance_properties': {
'uuid': instance['uuid'], },
}
scheduler_utils.set_vm_state_and_notify(context,
'compute_task', 'migrate_server',
dict(vm_state=instance['vm_state'],
task_state=None,
expected_task_state=task_states.MIGRATING,),
ex, request_spec, self.db)
except Exception as ex:
LOG.error(_('Migration of instance %(instance_id)s to host'
' %(dest)s unexpectedly failed.'),
{'instance_id': instance['uuid'], 'dest': destination},
exc_info=True)
raise exception.MigrationError(reason=ex)

  首先,第三行中将主机名赋给destination,然后执行迁移,后面的都是异常的捕捉,执行迁移的代码分为两部分,先看第一部分,在nova/conductor/tasks/live_migrate.py的184行左右:

 def execute(context, instance, destination,
block_migration, disk_over_commit):
task = LiveMigrationTask(context, instance,
destination,
block_migration,
disk_over_commit)
#TODO(johngarbutt) create a superclass that contains a safe_execute call
return task.execute()

  先创建包含安全执行回调的超类,然后返回如下函数也即执行迁移的第二部分代码,在54行左右:

 def execute(self):
self._check_instance_is_running()
self._check_host_is_up(self.source) if not self.destination:
self.destination = self._find_destination()
else:
self._check_requested_destination() #TODO(johngarbutt) need to move complexity out of compute manager
return self.compute_rpcapi.live_migration(self.context,
host=self.source,
instance=self.instance,
dest=self.destination,
block_migration=self.block_migration,
migrate_data=self.migrate_data)
#TODO(johngarbutt) disk_over_commit?

  这里有三部分内容:

  1. 如果目前主机不存在,则由调度算法选取一个目标主机,并且进行相关的检测,确保能够进行实时迁移操作;
  2. 如果目标主机存在,则直接进行相关的检测操作,确保能够进行实时迁移操作;
  3. 执行迁移操作。

  前两部分不再赘述,直接看第三部分代码,在nova/compute/rpcapi.py中:

 def live_migration(self, ctxt, instance, dest, block_migration, host,
migrate_data=None):
# NOTE(russellb) Havana compat
version = self._get_compat_version('3.0', '2.0')
instance_p = jsonutils.to_primitive(instance)
cctxt = self.client.prepare(server=host, version=version)
cctxt.cast(ctxt, 'live_migration', instance=instance_p,
dest=dest, block_migration=block_migration,
migrate_data=migrate_data)

  热迁移开始执行:

  def live_migration(self, context, instance, dest,
post_method, recover_method, block_migration=False,
migrate_data=None):
"""Spawning live_migration operation for distributing high-load. :param context: security context
:param instance:
nova.db.sqlalchemy.models.Instance object
instance object that is migrated.
:param dest: destination host
:param post_method:
post operation method.
expected nova.compute.manager.post_live_migration.
:param recover_method:
recovery method when any exception occurs.
expected nova.compute.manager.recover_live_migration.
:param block_migration: if true, do block migration.
:param migrate_data: implementation specific params """ greenthread.spawn(self._live_migration, context, instance, dest,
post_method, recover_method, block_migration,
migrate_data)

  这个方法中建立一个绿色线程来运行方法_live_migration,来执行实时迁移; 主要是调用libvirt python接口方法virDomainMigrateToURI,来实现从当前主机迁移domain对象到给定的目标主机;

  spawn:建立一个绿色线程来运行方法“func(*args, **kwargs)”,这里就是来运行方法_live_migration;

   _live_migration:执行实时迁移; 主要是调用libvirt python接口方法virDomainMigrateToURI,来实现从当前主机迁移domain对象到给定的目标主机;

   接着在绿色线程中调用_live_migration方法:

 def _live_migration(self, context, instance, dest, post_method,
recover_method, block_migration=False,
migrate_data=None):
"""Do live migration. :param context: security context
:param instance:
nova.db.sqlalchemy.models.Instance object
instance object that is migrated.
:param dest: destination host
:param post_method:
post operation method.
expected nova.compute.manager.post_live_migration.
:param recover_method:
recovery method when any exception occurs.
expected nova.compute.manager.recover_live_migration.
:param block_migration: if true, do block migration.
:param migrate_data: implementation specific params
""" # Do live migration.
try:
if block_migration:
flaglist = CONF.libvirt.block_migration_flag.split(',')
else:
flaglist = CONF.libvirt.live_migration_flag.split(',')
flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
logical_sum = reduce(lambda x, y: x | y, flagvals) dom = self._lookup_by_name(instance["name"])
dom.migrateToURI(CONF.libvirt.live_migration_uri % dest,
logical_sum,
None,
CONF.libvirt.live_migration_bandwidth) except Exception as e:
with excutils.save_and_reraise_exception():
LOG.error(_("Live Migration failure: %s"), e,
instance=instance)
recover_method(context, instance, dest, block_migration) # Waiting for completion of live_migration.
timer = loopingcall.FixedIntervalLoopingCall(f=None)
 if block_migration:
flaglist = CONF.libvirt.block_migration_flag.split(',')

  这个获取块迁移标志列表,block_migration_flag:这个参数定义了为块迁移设置迁移标志。

  else:
flaglist = CONF.libvirt.live_migration_flag.split(',')
flagvals = [getattr(libvirt, x.strip()) for x in flaglist]
logical_sum = reduce(lambda x, y: x | y, flagvals)

  这部分获取实时迁移标志列表,live_migration_flag这个参数定义了实时迁移的迁移标志。

 dom = self._lookup_by_name(instance["name"])

  根据给定的实例名称检索libvirt域对象。

 timer = loopingcall.FixedIntervalLoopingCall(f=None)

  获取等待完成实时迁移的时间。

  热迁移代码部分至此结束。

PS:本博客欢迎转发,但请注明博客地址及作者~

  博客地址:http://www.cnblogs.com/voidy/

  <。)#)))≦

OpenStack之虚机热迁移代码解析的更多相关文章

  1. OpenStack之虚机冷迁移代码简析

    OpenStack之虚机冷迁移代码简析 前不久我们看了openstack的热迁移代码,并进行了简单的分析.真的,很简单的分析.现在天气凉了,为了应时令,再简析下虚机冷迁移的代码. 还是老样子,前端的H ...

  2. OpenStack之虚机热迁移

    OpenStack之虚机热迁移 最近要搞虚机的热迁移,所以也就看了看虚机迁移部分的内容.我的系统是CentOS6.5,此处为基于NFS共享平台的虚机迁移.有关NFS共享服务器的搭建可以看这里. Yak ...

  3. openstack 虚机热迁移问题:虚机状态一直处于迁移中的情况处理

    前提:在偶尔的虚机热迁移中,发现虚机一直属于迁移状态中. 但是查看后台流量监控,发现没有流量已经下来了.然后在目标机器上查看,发现kvm已经在目标机器上. 1.查看kvm 实际所处宿主机方法: a.拿 ...

  4. OpenStack的Resize和冷迁移代码解析及改进

    原文:http://www.hengtianyun.com/download-show-id-79.html OpenStack的Resize(升级)功能,我们可以改变虚拟机的CPU核数.内存及磁盘大 ...

  5. v2v-VMware/VSphere中虚机离线迁移至openstack平台

    先决条件 exsi到openstack的迁移,分为两种,一种是静态迁移,另一种是在线迁移. 静态迁移(offline migration)也叫做常规迁移,离线迁移.在迁移之前将虚拟机暂停,同时拷贝虚拟 ...

  6. OpenStack 创建虚机过程简要汇总

    1. 总体流程 翻译自原文(英文):https://ilearnstack.com/2013/04/26/request-flow-for-provisioning-instance-in-opens ...

  7. 010.KVM虚机冷迁移

    一 实验环境 原虚机名称:vm01-centos6.8 原虚机所在宿主机:kvm-host-2 迁移后虚机名称:vm01-cloud-centos6.8 迁移后虚机所在宿主机:kvm-host-2 二 ...

  8. openstack新建虚机、网络、路由时候对应的ovs网桥的变化

    [root@wb5 ~]# [root@wb5 ~]# [root@wb5 ~]# [root@wb5 ~]# [root@wb5 ~]# [root@wb5 ~]# [root@wb5 ~]# [r ...

  9. openstack nova 虚机镜像后端提取

    参考链接:https://www.cnblogs.com/storymedia/p/4500186.html 1.nova 创建的虚机后端目录 其中的base是虚机基础镜像,创建虚机会根据这个基础镜像 ...

随机推荐

  1. Python学习笔记-day1(while流程控制)

    count = 0 while True: #print('count:',count) if count == 3: print('you guess over 3 times!fuck off!' ...

  2. 数据库SQL优化大总结之 百万级数据库优化方案2

      网上关于SQL优化的教程很多,但是比较杂乱.近日有空整理了一下,写出来跟大家分享一下,其中有错误和不足的地方,还请大家纠正补充. 这篇文章我花费了大量的时间查找资料.修改.排版,希望大家阅读之后, ...

  3. 数据字典的设计--4.DOM对象的ajax应用

    需求:点击下拉选项框,选择一个数据类型,在表单中自动显示该类型下所有的数据项的名称,即数据库中同一keyword对应的所有不重复的ddlName.      1.在dictionaryIndex.js ...

  4. tomcat jvm参数优化

    根据gc(垃圾回收器)的选择,进行参数优化 JVM给了三种选择:串行收集器.并行收集器.并发收集器,但是串行收集器只适用于小数据量的情况,所以这里的选择主要针对并行收集器和并发收集器. -XX:+Us ...

  5. 使用后台程序的第一个表单Form

    参考手册:http://www.yiichina.com/doc/guide/2.0/start-forms 1.创建模型:advanced\backend\models\moxing.php 此模型 ...

  6. Android(java)学习笔记81:在TextView组件中利用Html插入文字或图片

    1. TextView中利用Html插入文字或者图片: 首先我们看看代码: (1)activity_main.xml: <LinearLayout xmlns:android="htt ...

  7. 【HHHOJ】NOIP2018 模拟赛(二十五) 解题报告

    点此进入比赛 得分: \(100+100+20=220\)(\(T1\)打了两个小时,以至于\(T3\)没时间打了,无奈交暴力) 排名: \(Rank\ 8\) \(Rating\):\(+19\) ...

  8. python 基础之列表切片内置方法

    列表操作 c=['cx','zrd','ajt','dne'] #定义一个列表,有4个元素 #增删改查 print(c[3]) #从0计数 测试 D:\python\python.exe D:/unt ...

  9. {"errmsg":"invalid weapp pagepath hint: [IunP8a07243949]","errcode":40165}微信的坑

    使用微信官方文档,发送请求会报错--   pagepath无效! 正确修改-- 将标红的pagepath改成 page与上面相同即可

  10. java Html&JavaScript面试题:判断第二个日期比第一个日期大

    如何用脚本判断用户输入的的字符串是下面的时间格式2004-11-21 必须要保证用户的输入是此格式,并且是时间,比如说月份不大于12等等,另外我需要用户输入两个,并且后一个要比前一个晚,只允许用JAV ...