Complex Instance Placement
Reposted from: https://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/complex-instance-placement.html
This work is licensed under a Creative Commons Attribution 3.0 Unported License: http://creativecommons.org/licenses/by/3.0/legalcode
Problem description
Problem Definition
An IP Multimedia Subsystem (IMS) core [2] is a key element of Telco infrastructure, handling VoIP device registration and call routing. Specifically, it provides SIP-based call control for voice and video as well as SIP-based messaging apps.
An IMS core is mainly a compute application with modest demands on storage and network - it provides the control plane, not the media plane (packets typically travel point-to-point between the clients) so does not require high packet throughput rates and is reasonably resilient to jitter and latency.
As a core Telco service, the IMS core must be deployable as an HA service capable of meeting strict Service Level Agreements (SLA) with users. Here HA refers to the availability of the service for completing new call attempts, not for continuity of existing calls. As a control plane rather than media plane service, the user experience of an IMS core failure is typically that audio continues uninterrupted but any actions requiring signalling (e.g. conferencing in a 3rd party) fail. However, it is not unusual for clients to send periodic SIP “keep-alive” pings during a call, and if the IMS core is not able to handle them the client may tear down the call.
An IMS core must be highly scalable, and as an NFV function it will be elastically scaled by an NFV orchestrator running on top of OpenStack. The requirements that such an orchestrator places on OpenStack are not addressed in this use case.
Opportunity/Justification
Currently OpenStack supports basic workload affinity/anti-affinity using a concept called server groups. These allow for creation of groups of instances whereby each instance in the group has either affinity or anti-affinity (depending on the group policy) towards all other instances in the group. There is however no concept of having two separate groups of instances where the instances in the group have one policy towards each other, and a different policy towards all instances in the other group.
Additionally, there is no concept of expressing affinity rules that can control how concentrated the members of a server group can be - that is, how tightly members of a server group may be packed onto any given host. For some applications it may be desirable to pack tightly, to minimise latency between them; for others, it may be undesirable, as then the failure of any given host can take out an unacceptably high percentage of the total application resources. Such requirements can partially be met with so-called "soft" affinity and anti-affinity rules (if implemented) but may require more advanced policy knobs to set how much packing or spread is too much.
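To make the server-group concept above concrete, the following is a minimal sketch using python-novaclient: one group with the hard affinity policy and one with the soft-anti-affinity policy (available from compute API microversion 2.15), with members placed via a scheduler hint at boot time. The endpoint, credentials, group names and image/flavour IDs are placeholders, and the exact keyword arguments accepted vary across novaclient releases and microversions, so treat this as a sketch rather than a definitive recipe.

```python
# Sketch only: the endpoint, credentials, names and IDs below are placeholders,
# and keyword arguments vary across novaclient releases and API microversions.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from novaclient import client as nova_client

auth = v3.Password(auth_url="http://keystone:5000/v3",   # placeholder endpoint
                   username="demo", password="secret",
                   project_name="demo",
                   user_domain_id="default", project_domain_id="default")
sess = session.Session(auth=auth)

# Microversion 2.15 introduced the soft-affinity/soft-anti-affinity policies.
nova = nova_client.Client("2.15", session=sess)

# Members of this group are packed onto the same host where possible.
packed = nova.server_groups.create(name="latency-sensitive",
                                   policies=["affinity"])

# Members of this group are spread across hosts on a best-effort basis.
spread = nova.server_groups.create(name="ims-sprout-pool",
                                   policies=["soft-anti-affinity"])

# A server joins a group at boot time via a scheduler hint.
nova.servers.create(name="sprout-1",
                    image="<image-id>", flavor="<flavor-id>",
                    scheduler_hints={"group": spread.id})
```

Note that each instance can belong to at most one server group, and the group policy applies uniformly between all of its members - which is exactly the limitation the paragraph above describes.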
Although this user story is written from a particular virtual IMS use case, it is generally applicable to many other NFV applications and more broadly to any applications which have some combination of:
- Performance requirements that are met by packing related workloads; or
- Resiliency requirements that are met by spreading related workloads
Requirements Specification
Use Cases
- As a communication service provider, I want to deploy a highly available IMS core as a Virtual Network Function running on OpenStack so that I meet my SLAs.
- As an enterprise operator, I want to deploy my traditional database server shards such that they are not on the same physical nodes so that I avoid a service outage due to failure of a single node.
Usage Scenarios Examples
Project Clearwater [3] is an open-source implementation of an IMS core designed to run in the cloud and be massively scalable. It provides P/I/S-CSCF functions together with a BGCF and an HSS cache, and includes a WebRTC gateway providing interworking between WebRTC & SIP clients.
Related User Stories
Requirements
The problem statement above leads to the following requirements.
Compute application
OpenStack already provides everything needed; in particular, there are no requirements for an accelerated data plane, nor for core pinning nor NUMA.
HA
Project Clearwater itself implements HA at the application level, consisting of a series of load-balanced N+k pools with no single points of failure [4].
To meet typical SLAs, it is necessary that the failure of any given host cannot take down more than k VMs in each N+k pool. More precisely, given that those pools are dynamically scaled, it is a requirement that at no time is there more than a certain proportion of any pool instantiated on the same host. See Gaps below.
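To make the proportionality requirement concrete: if the failure of a single host may cost at most k members of an N+k pool, then no host may ever carry more than k of the pool's N+k members. The tiny helper below is purely illustrative arithmetic (it is not part of Clearwater or OpenStack), showing the per-host cap and the minimum host count that follows from it.

```python
import math

def max_members_per_host(n: int, k: int) -> int:
    """Largest number of an N+k pool's members allowed on one host if
    losing a single host must never remove more than k members."""
    if k < 1:
        raise ValueError("k must be at least 1 for any redundancy")
    return k

def min_hosts_required(n: int, k: int) -> int:
    """Minimum number of hosts needed to place all N+k members without
    exceeding the per-host cap."""
    return math.ceil((n + k) / max_members_per_host(n, k))

# Example: a 10+2 pool tolerates at most 2 members per host,
# so its 12 members need at least 6 hosts.
print(max_members_per_host(10, 2))   # -> 2
print(min_hosts_required(10, 2))     # -> 6
```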
That by itself is insufficient for offering an SLA, though: to be deployable in a single OpenStack cloud (even spread across availability zones or regions), the underlying cloud platform must be at least as reliable as the SLA demands. Those requirements will be addressed in a separate use case.
Elastic scaling
An NFV orchestrator must be able to rapidly launch or terminate new instances in response to applied load and service responsiveness. This is basic OpenStack Nova functionality.
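As a sketch of the underlying Nova calls such an orchestrator would drive (continuing the hypothetical client, group and placeholder IDs from the earlier sketch):

```python
# Scale out: add one more member to the pool (placeholder IDs, sketch only).
new_vm = nova.servers.create(name="sprout-4",
                             image="<image-id>", flavor="<flavor-id>",
                             scheduler_hints={"group": spread.id})

# Scale in: remove a member once load drops and it has drained.
nova.servers.delete(new_vm)
```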
Placement zones
In the IMS architecture there is a separation between access and core networks, with the P-CSCF component (Bono - see [4]) bridging the gap between the two. Although Project Clearwater does not yet support this, it would in future be desirable to support Bono being deployed in a DMZ-like placement zone, separate from the rest of the service in the main MZ.
Gaps
The above requirements currently suffer from these gaps:
Affinity for N+k pools
An N+k pool is a pool of identical, stateless servers, any of which can handle requests for any user. N is the number required purely for capacity; k is the additional number required for redundancy. k is typically greater than 1 to allow for multiple failures. During normal operation N+k servers should be running.
Affinity/anti-affinity can be expressed pair-wise between VMs, which is sufficient for a 1:1 active/passive architecture, but an N+k pool needs something more subtle. Specifying that all members of the pool should live on distinct hosts is clearly wasteful. Instead, availability modelling shows that the overall availability of an N+k pool is determined by the time to detect and spin up new instances, the time between failures, and the proportion of the overall pool that fails simultaneously. The OpenStack scheduler needs to provide some way to control the last of these by limiting the proportion of a group of related VMs that are scheduled on the same host.
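For illustration only (this is not part of the original user story): Nova later added a knob along these lines in compute API microversion 2.64, where an anti-affinity server group can carry a `max_server_per_host` rule capping how many group members the scheduler will place on one host. A hedged sketch in the same novaclient style as above, with names and the cap value as placeholders and the exact keyword arguments depending on the client release:

```python
# Sketch only: requires a Nova deployment exposing microversion >= 2.64,
# where server groups take a single "policy" plus an optional "rules" dict.
nova = nova_client.Client("2.64", session=sess)

pool_group = nova.server_groups.create(
    name="ims-sprout-pool",
    policy="anti-affinity",               # singular "policy" at >= 2.64
    rules={"max_server_per_host": 2})     # at most 2 pool members per host

# Members are still placed via the same scheduler hint as before.
nova.servers.create(name="sprout-1",
                    image="<image-id>", flavor="<flavor-id>",
                    scheduler_hints={"group": pool_group.id})
```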
External References
- [1] https://wiki.openstack.org/wiki/TelcoWorkingGroup/UseCases#Virtual_IMS_Core
- [2] https://en.wikipedia.org/wiki/IP_Multimedia_Subsystem
- [3] http://www.projectclearwater.org
- [4] http://www.projectclearwater.org/technical/clearwater-architecture/
- [5] https://review.openstack.org/#/c/247654/
- [6] https://blueprints.launchpad.net/nova/+spec/generic-resource-pools
Rejected User Stories / Usage Scenarios
None.
Glossary
- NFV - Network Functions Virtualisation, see http://www.etsi.org/technologies-clusters/technologies/nfv
- IMS - IP Multimedia Subsystem
- SIP - Session Initiation Protocol
- P/I/S-CSCF - Proxy/Interrogating/Serving Call Session Control Function
- BGCF - Breakout Gateway Control Function
- HSS - Home Subscriber Server
- WebRTC - Web Real-Time Communication