OpenStack nova compute supports two flavors of Virtual Machine (VM) migration:

  • Cold migration -- migration of a VM which requires the VM to be powered off during the migrate operation during which time the VM is inaccessible.
  • Hot or live migration -- zero down-time migration whereupon the VM is not powered off during the migration and thus remains accessible.

Understanding these VM migration operations from an OpenStack internals perspective can be a daunting task. I had the pleasure of digging into these flows in the latter part of 2013 and as part of that effort created a rough outline of the internal flows. Other's I've worked with found these flow outlines useful and thus they're provided below.

Note -- The outlines below were created based on the OpenStack source in late 2013 and thus reflect the state of OpenStack at that point in time.

Live Migration Flow:
  • nova.api.openstack.compute.contrib.admin_actions._migrate_live()
  • nova.compute.api.live_migrate()
    • update instance state to MIGRATING state
    • call into scheduler to live migrate (scheduler hint will be set to the host select (which may be none)).
  • nova.scheduler.manager.live_migration()
  • nova.scheduler.manager._schedule_live_migration()
  • nova.conductor.tasks.live_migrate.LiveMigrationTask.execute()
    • check that the instance is running
    • check that the instance's host is up
    • if destination host provided, check that it..
      1. is different than the instance's host
      2. is up
      3. has enough memory
      4. is compatible with the instance's host (i.e. hypervisor type and version)
      5. passes live migration checks (call using amqp rpc into nova manager check_can_live_migrate_destination)
    • else destination host not provided, find a candidate destination host and check that it...
      1. is compatible with the instance's host (i.e. hypervisor type and version)
      2. passes live migration checks (call using amqp rpc into nova manager check_can_live_migrate_destination)
    • call using amqp rpc into nova manager live_migration
      Note: Migration data is initially set by check_can_live_migrate_destination and can be used for implementation specific parameters from this point.
  • nova.compute.manager.check_can_live_migrate_destination()
    • driver.check_can_live_migrate_destination()
    • call using amqp rpc into nova manager check_can_live_migrate_source
    • driver.check_can_live_migrate_destination_cleanup()
  • nova.compute.manager.check_can_live_migrate_source()
    • determine if the instance is volume backed and add result to the migration data
    • driver.check_can_live_migrate_source()
  • nova.compute.manager.live_migration()
    • if block migration request then driver.get_instance_disk_info()
    • call using amqp rpc into nova manager pre_live_migration
      • Error handler: _rollback_live_migration
    • driver.live_migration()
  • nova.compute.manager.pre_live_migration()
    • get the block device information for the instance
    • get the network information for the instance
    • driver.pre_live_migration()
    • setup networks on destination host by calling the network API setup_networks_on_host
    • driver.ensure_filtering_rules_for_instance()
  • nova.compute.manager._rollback_live_migration()
    • update instance state to ACTIVE state
    • re-setup networks on source host by calling the network API setup_networks_on_host
    • for each instance volume connection call using amqp rpc into nova manager remove_volume_connection
    • if block migration or volume backed migration without shared storage
      • call using amqp rpc into nova manager rollback_live_migration_at_destination
  • nova.compute.manager._post_live_migration()
    • driver.get_volume_connector()
    • for each instance volume connection call the volume API terminate_connection
    • driver.unfilter_instance()
    • call into conductor to network_migrate_instance_start which will eventually call the network API migrate_instance_start
    • call using amqp rpc into nova manager post_live_migration_at_destination
    • if block migration or not shared storage driver.destroy()
    • else driver.unplug_vifs()
    • tear down networks on source host by calling the network API setup_networks_on_host
  • nova.compute.manager.post_live_migration_at_destination()
    • setup networks on destination host by calling the network API setup_networks_on_host
    • call into conductor to network_migrate_instance_finish which will eventually call the network API migrate_instance_finish
    • driver.post_live_migration_at_destination()
    • update instance to ACTIVE state
    • setup networks on destination host by calling the network API setup_networks_on_host
  • nova.compute.manager.rollback_live_migration_at_destination()
    • tear down networks on destination host by calling the network API setup_networks_on_host
    • driver.destroy()
  • nova.compute.manager.remove_volume_connection()
    • call _detach_volume
    • driver.get_volume_connector()
    • remove the volume connection by calling the volume API terminate_connection
  • nova.compute.manager._detach_volume()
    • driver.detach_volume()

      • Since the live migration failed the VM should not be on the destination host.  So this should be a no-op.
    • If there is an exception detaching the volume then rollback the detach by calling the volume API roll_detaching
 
Cold Migration Flow:
  • nova.api.openstack.compute.servers._resize()
  • nova.api.openstack.compute.contrib.admin_actions._migrate()
  • nova.compute.api.resize()
    • if flavor_id is not passed, migrate host only and keep the original flavor
    • else flavor_id is given, migrate host and resize to new flavor
    • lookup the image for the instance by calling the image API show
    • check quota headroom and reserve
    • update instance to RESIZE_PREP state
    • determine if the instance's current host should be ignored as a migration target and update filter properties for the scheduler accordingly
    • call into scheduler to prep_resize
  • nova.scheduler.manager.prep_resize()
    • call scheduler driver to schedule_prep_resize
    • if no valid host was found then update instance to ACTIVE state and rollback quota reservation
    • if error occurred then update instance to ERROR state and rollback quota reservation
  • nova.scheduler.filter_scheduler.schedule_prep_resize()
    • run through scheduler filters to select host
    • call using amqp rpc into nova manager prep_resize
  • nova.compute.manager.prep_resize()
    • if no node specified call driver.get_available_nodes()
    • call _prep_resize
      • if an exception occurs then call into scheduler to prep_resize again if possible
  • nova.compute.manager._prep_resize()
    • if same host is used then ensure that the same host is allowed (as per configuration)
    • call using amqp rpc into nova manager resize_instance
  • nova.compute.manager.resize_instance()
    • get network and instance information
    • update instance to RESIZE_MIGRATING state
    • get block device information
    • call driver.migrate_disk_and_power_off()
    • call _terminate_volume_connections
    • call into conductor to network_migrate_instance_start which will eventually call the network API migrate_instance_start
    • update instance to RESIZE_MIGRATED state
    • call using amqp rpc into nova manager finish_resize
  • nova.compute.manager._terminate_volume_connections()
    • if there is a volume connection to terminate

      • driver.get_volume_connector()
      • for each volume connection remove the connection by calling the volume API terminate_connection
  • nova.compute.manager.finish_resize()
    • call _finish_resize
    • if successful commit the quota reservation
    • else rollback the quota reservation and update instance to ERROR state
  • nova.compute.manager._finish_resize()
    • if the flavor is changing then update the instance with the new flavor
    • setup networks on destination host by calling the network API setup_networks_on_host
    • call into conductor to network_migrate_instance_finish which will eventually call the network API migrate_instance_finish
    • update instance to RESIZE_FINISHED state
    • refresh and get block device information
    • driver.finish_migration()
    • update instance to RESIZED state
Cold migration confirm flow:
Cold migration revert flow:
    • nova.api.openstack.compute.servers._action_revert_resize()
    • nova.compute.api.revert_resize()
      • reserve quota for increase in resource usage
      • update instance task state to RESIZE_REVERTING
      • call amqp rpc into nova manager revert_resize
    • nova.compute.manager.revert_resize()
      • tear down networks on destination host by calling the network API setup_networks_on_host
      • call into conductor to network_migrate_instance_start which will eventually call the network API migrate_instance_start
      • get block device information
      • driver.destroy()
      • call _terminate_volume_connections
      • drop resize resources claimed on destination
      • call amqp rpc into nova manager finish_revert_resize
    • nova.compute.manager.finish_revert_resize()
      • update instance back to pre-resize values
      • re-setup networks on source host by calling the network API setup_networks_on_host
      • refresh and get block device information
      • driver.finish_revert_migration()
      • update instance to RESIZE_REVERTING state
      • call into conductor to network_migrate_instance_finish which will eventually call the network API migrate_instance_finish
      • update instance to ACTIVE (or possibly STOPPED) state
      • commit the quota usage

Source: http://bodenr.blogspot.com/2014/03/openstack-nova-vm-migration-live-and.html

OpenStack nova VM migration (live and cold) call flow的更多相关文章

  1. OpenStack Nova Release(Rocky to Train)

    目录 文章目录 目录 前言 演进方向 Cellv2 更新 Rocky Support disabling a cell Stein Handling a down cell Train Count q ...

  2. OpenStack Nova 制作 Windows 镜像

    OpenStack Nova 制作 Windows 镜像   windows虚拟机ubuntuimage防火墙云计算 本贴转自http://www.vpsee.com 上次 VPSee 给 OpenS ...

  3. OpenStack Nova 高性能虚拟机之 CPU 绑定

    目录 文章目录 目录 前文列表 KVM KVM 的功能列表 KVM 工具集 KVM 虚拟机的本质是什么 vCPU 的调度与性能问题 Nova 支持的 vCPU 绑定 vcpu\_pin\_set 配置 ...

  4. Openstack Nova 源码分析 — 使用 VCDriver 创建 VMware Instance

    目录 目录 前言 流程图 nova-compute vCenter 前言 在上一篇Openstack Nova 源码分析 - Create instances (nova-conductor阶段)中, ...

  5. Openstack Nova 源码分析 — RPC 远程调用过程

    目录 目录 Nova Project Services Project 的程序入口 setuppy Nova中RPC远程过程调用 nova-compute RPC API的实现 novacompute ...

  6. 如何删除 OpenStack Nova 僵尸实例

    转自:http://www.vpsee.com/2011/11/how-to-delete-a-openstack-nova-zombie-instance/ 前天强制重启一台 OpenStack N ...

  7. 深挖Openstack Nova - Scheduler调度策略

    深挖Openstack Nova - Scheduler调度策略   一.  Scheduler的作用就是在创建实例(instance)时,为实例选择出合适的主机(host).这个过程分两步:过滤(F ...

  8. OpenStack Nova

    OpenStack Nova 简介 OpenStack 中的 Nova 负责维护和管理云环境的计算资源 Nova 在现有 Linux 服务器上作为一组守护线程来提供服务 Nova 由多个服务器进程组成 ...

  9. OpenStack Nova 高性能虚拟机之 NUMA 架构亲和

    目录 文章目录 目录 写在前面 计算平台体系结构 SMP 对称多处理结构 NUMA 非统一内存访问结构 MPP 大规模并行处理结构 Linux 上的 NUMA 基本对象概念 NUMA 调度策略 获取宿 ...

随机推荐

  1. 通过 Informix 系统表监控和优化数据库

    Informix 数据库系统字典表简介 Informix 数据库服务器运行时的状态信息是数据库管理员 DBA 进行系统监控和优化的必需信息来源.Informix 的状态信息在内部以 2 种方式存在,如 ...

  2. NGUI UIToggle

    NGUI UIToggle 1.add a UI Toggle(Script) and UI Toggle Objects(Script) to a Tab Button(Which has a UI ...

  3. int转string

    #include<iostream> #include<sstream> #include<string> using namespace std; int mai ...

  4. shell变量赋值 不能有空格的原因

    典型例子: a=date echo $a      成立 a =date echo $a     不成立 其实原因很简单 shell在解释命令时的原则是第一个符号标记只能是程序或者命令,有空格的时候第 ...

  5. NGINX location 在配置中的优先级

    location表达式类型 ~ 表示执行一个正则匹配,区分大小写 ~* 表示执行一个正则匹配,不区分大小写 ^~ 表示普通字符匹配.使用前缀匹配.如果匹配成功,则不再匹配其他location. = 进 ...

  6. PHP数据库页面增删查

    1.用户操作页面 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://ww ...

  7. Leetcode 1 two sum 难度:0

    https://leetcode.com/problems/two-sum/ class Solution { public: vector<int> twoSum(vector<i ...

  8. POJ 2184 01背包+负数处理

    Cow Exhibition Time Limit: 1000MS   Memory Limit: 65536K Total Submissions: 10200   Accepted: 3977 D ...

  9. sqoop连接oracle与mysql&mariadb的错误

    错误说明: 由于我的hadoop的集群是用cloudera manager在线自动安装的,因此他们的安装路径必须遵循cloudera的规则,这里只有查看cloudera的官方文档了,请参考:http: ...

  10. zipalign内存对齐优化

    zipalign:android中SDK下tools文件夹 用来对资源文件的内存进行对齐优化 手工命令: 优化:zipalign -v 4 source.apk destination.apk 4: ...