Introduction

Span<T> is a new type we are adding to the platform to represent contiguous regions of arbitrary memory, with performance characteristics on par with T[]. Its APIs are similar to the array, but unlike arrays, it can point to either managed or native memory, or to memory allocated on the stack.

// managed memory

var arrayMemory = new byte[100];

var arraySpan = new Span<byte>(arrayMemory);

// native memory

var nativeMemory = Marshal.AllocHGlobal(100);

Span<byte> nativeSpan;

unsafe {

    nativeSpan = new Span<byte>(nativeMemory.ToPointer(), 100);

}

SafeSum(nativeSpan);

Marshal.FreeHGlobal(nativeMemory);

// stack memory

Span<byte> stackSpan;

unsafe {

    byte* stackMemory = stackalloc byte[100];

    stackSpan = new Span<byte>(stackMemory, 100);

}

SafeSum(stackSpan);

As such, Span<T> is an abstraction over all types of memory available to .NET programs.

// this method does not care what kind of memory it works on

static ulong SafeSum(Span<byte> bytes) {

    ulong sum = 0;

    for(int i=0; i < bytes.Length; i++) {

        sum += bytes[i];

    }

    return sum;

}

When wrapping an array, Span<T> is not limited to pointing to the first element of the array. It can point to any sub-range. In other words, it supports slicing.

var array = new byte[] { 1, 2, 3 };

var slice = new Span<byte>(array, start:1, length:2);

Console.WriteLine(slice[0]); // prints 2

API Surface

The full API surface of Span<T> is not yet finalized, but the main APIs we will expose are the following:

public struct Span<T> {

    public Span(T[] array)

    public Span(T[] array, int index)

    public Span(T[] array, int index, int length)

    public unsafe Span(void* memory, int length)

    public static implicit operator Span<T> (ArraySegment<T> arraySegment);

    public static implicit operator Span<T> (T[] array);

    public int Length { get; }

    public ref T this[int index] { get; }

    public Span<T> Slice(int index);

    public Span<T> Slice(int index, int length);

    public bool TryCopyTo(T[] destination);

    public bool TryCopyTo(Span<T> destination);

    public T[] ToArray();

}

In addition, we will provide a read-only version of Span<T>. The ReadOnlySpan<T> is required to represent slices of immutable and read-only structures, e.g. System.String slices. This is discussed below.

Scenarios

Span<T> is a small, but critical, building block for a much larger effort to provide .NET APIs to enable development of high scalability server applications.

The .NET Framework design philosophy has focused almost solely on productivity for developers writing application software. In addition, many of the Framework’s design decisions were made assuming Windows client-server applications circa 1999. This design philosophy is a big part of .NET’s success as .NET is universally viewed as a very high productivity platform.

But the landscape has shifted since our platform was conceived almost 20 years ago. We now target non-Windows operating systems, our developers write cloud hosted services demanding different tradeoffs than client-server applications, the state of the art patterns have moved away from once popular technologies like XML, UTF16, SOAP (to name a few), and the hardware running today’s software is very different than what was available 20 years ago.

When we analyze the gaps we have today and the requirements of today’s high scale servers, we realize that we need to provide modern no-copy, low-allocation, and UTF8 data transformation APIs that are efficient, reliable, and easy to use. Prototypes of such APIs are available in corefxlab repository, and Span<T> is one of the main fundamental building blocks for these APIs.

Data Pipelines

Modern servers are often designed as, often reactive, pipelines of components doing transformations on byte buffers. For example, such pipeline in a web server might consist of the following transformations: socket fills in a buffer -> HTTP parsing -> decompression -> Base 64 decoding -> routing -> HTML writing -> HTML escaping -> HTTP writing -> compression -> socket writing.

Span<byte> is very useful for implementing transformation routines of such data pipelines. First, Span<T> allows the server to freely switch between managed and native buffers depending on situation/settings. For example, Windows RIO sockets work best with native buffers, and libuv Kestrel works best with pinned managed arrays. Secondly, it allows complicated transformation algorithms to be implemented in safe code without the need to resort to using raw pointers. Lastly, the fact that Span<T> is slicable allows the pipeline to abstract away the physical chunks of buffers, treating them as uniform logical chunks relevant to that particular section of the pipeline.

The stack-only nature of Span<T> (see more on this below) allows pooled memory to be safely returned to the pool after the transformation pipeline completes, and allows the pipeline to pass only the relevant slice of the buffer to each transformation routine/component. In other words, Span<T> aids in lifetime management of pooled buffers, which is critical to the performance of today's servers.

Discontinuous Buffers

As alluded to before, data pipelines often process data in chunks as they arrive at a socket. This creates problems for data transformation routines such as parsing, which often have to deal with data that can reside in two or more buffers. For example, there might be a need to parse an integer residing partially in one buffer and partially in another. Since spans can abstract stack memory, they can solve this problem in a very elegant and performant way as illustrated in the following routine from ASP.NET Channels pipeline (full source):

public unsafe static uint GetUInt32(this ReadableBuffer buffer) {

    ReadOnlySpan<byte> textSpan;

    if (buffer.IsSingleSpan) { // if data in single buffer, it’s easy

        textSpan = buffer.First.Span;

    }

    else if (buffer.Length < 128) { // else, consider temp buffer on stack

        var data = stackalloc byte[128];

        var destination = new Span<byte>(data, 128);

        buffer.CopyTo(destination);

        textSpan = destination.Slice(0, buffer.Length);

    }

    else {

        // else pay the cost of allocating an array

        textSpan = new ReadOnlySpan<byte>(buffer.ToArray());

    }

    uint value;

    var utf8Buffer = new Utf8String(textSpan);

    // yet the actual parsing routine is always the same and simple

    if (!PrimitiveParser.TryParseUInt32(utf8Buffer, out value)) {

        throw new InvalidOperationException();

    }

    return value;

}

Non-Allocating Substring

Modern server protocols are more often than not text-based, and so it's not surprising that such servers often create and manipulate lots of strings.

One of the most common basic string operations is string slicing. Currently, System.String.Substring is the main .NET API for creating slices of a string, but the API is inefficient as it creates a new string to represent the slice and copies the characters from the original string to the new string slice. Because of this inefficiency, high performance servers shy away from using this API, where they can (in their internals), and pay the cost in the publicly facing APIs.

ReadOnlySpan<char> could be a much more efficient standard representation of a subsection of a string:

public struct ReadOnlySpan<T> {

    public ReadOnlySpan(T[] array)

    public ReadOnlySpan(T[] array, int index)

    public ReadOnlySpan(T[] array, int index, int length)

    public unsafe ReadOnlySpan(void* memory, int length)

    public int Length { get; }

    public T this[int index] { get; }

    public ReadOnlySpan <T> Slice(int index)

    public ReadOnlySpan <T> Slice(int index, int count)

    public bool TryCopyTo(T[] destination);

    public bool TryCopyTo(Span<T> destination);

}

ReadOnlySpan<char> lengthText = "content-length:123".Slice(15);

Parsing

Currently, the .NET parsing APIs require the exact string representing the text being parsed to be passed as the argument to the APIs:

    int value = int.Parse("123");

    int value = int.Parse("content-length:123".Substring(15)); // this allocates

We do not have APIs that can parse a slice of a string or a text buffer without the need to first allocate a substring representing the text being parsed. ReadOnlySpan<char>-based APIs, together with non-allocating substring APIs discussed above, could solve this problem:

public static class SpanParsingExtensions {

    static bool TryParse(this ReadOnlySpan<char> text, out int value)

}

"content-length:123".Slice(15).TryParse(out int value);

The API can be further improved (take in Span<byte>) to parse text buffers regardless of the encoding (e.g. UTF8)

public static class SpanParsingExtensions {

    static bool TryParse(this ReadOnlySpan<byte> text, EncodingData encoding, out int value)

}

var byteArray = new byte[]{49, 50, 51};

var bytesFromStringSlice = "content-length:123".Slice(15).As<byte>();

var bytesFromUtf8StringSlice = new Utf8String("content-length:123").Slice(15);

byteArray.TryParse(EncodingData.Utf8, out int value);

bytesFromStringSlice.TryParse(EncodingData.Utf16, out int value);

bytesFromUtf8StringSlice.TryParse(EncodingData.Utf8, out int value);

Formatting

Similarly, formatting (the reverse of parsing) can be very elegantly and efficiently done on existing memory buffers backed by slices of arrays, native buffers, and stack allocated arrays. For Example, the following routine from corfxlab formats an integer (as UTF8 text) into an arbitrary byte buffer:

public static bool TryFormat(this int value, Span<byte> buffer, out int bytesWritten)

Buffer Pooling

Span<T> can be used to pool memory from a large single buffer allocated on the native heap. This decreases [pointless] work the GC needs to perform to manage pooled buffers, which never get collected anyway, but often need to be permanently pinned, which is bad for the system. Also, the fact that native memory does not move lowers the cost of interop and the cost of pool related error checking (e.g. checking if a buffer is already returned to the pool).

Separatelly, the stack-only nature of Span<T> makes lifetime management of pooled memory more relaible; it helps in avoiding use-after-free errors with pooled memory. Without Span<T>, it’s often not clear when a pooled buffer that was passed to a separate module can be returned to the pool, as the module could be holding to the buffer for later use. With Span<T>, the server pipeline can be sure that there are no more references to the buffer after the stack pops to the frame that first allocated the span and passed it down to other modules.

Native code interop

Today, unmanaged buffers passed over unmanaged to managed boundary are frequently copied to byte[] to allow safe access from managed code. Span<T> can eliminate the need to copy in many such scenarios.

Secondly, a number of performance critical APIs in the Framework take unsafe pointers as input. Examples include Encoding.GetChars or Buffer.MemoryCopy. Over time, we should provide more safe APIs that use Span<T>, which will allow more code to compile as safe but still preserve its performance characteristics.

Requirements

To support the scenarios described above, Span<T> must meet the following requirements:

Ability to wrap managed and native memory
Performance characteristics on par with arrays
Be memory-safe

Design/Representation

We will provide two different implementations of Span<T>:

Fast Span<T> (available on runtimes with special support for spans)
Slow Span<T> (available on all current .NET runtimes, even existing ones, e.g. .NET 4.5)

The fast implementation, will rely on "ref field" support and will look as follows:

public struct Span<T> {

    internal ref T _pointer;

    internal int _length;

}

A prototype of such fast Span<T> can be found here. Through the magic of the "ref field", it can support slicing without requiring a strong pointer to the root of the sliced object. The GC is able to trace the interior pointer, keep the root object alive, and update the interior pointer if the object is relocated during a collection.

A different representation will be implemented for platforms that don’t support ref fields (interior pointers):

public struct Span<T> {

    internal IntPtr _pointer;

    internal object _relocatableObject;

    internal int _length;

}

A prototype of this design can be found here. In this representation, the Span<T>'s indexer will add the _pointer and the address of _relocatableObject before accessing items in the Span. This will make the accessor slower, but it will ensure that when the GC moves the sliced object (e.g. array) in memory, the indexer still accesses the right memory location. Note that if the Span wraps a managed object, the _pointer field will be the offset off the object's root to the objects data slice, but if the Span wraps a native memory, the _pointer will point to the memory and the _relocatableObject will be set to null (zero). In either case, adding the pointer and the address of the object (null == 0) results in the right "effective" address.

Struct Tearing

Struct tearing is a threading issue that affects all structs larger than what can be atomically updated on the target processor architecture. For example, some 64-bit processors can only update one 64-bit aligned memory block atomically. This means that some processors won’t be able to update both the _pointer and the _length fields of the Span atomically. This in turn means that the following code, might result in another thread observing _pointer and _length fields belonging to two different spans (the original one and the one being assigned to the field):

internal class Buffer {

    Span<byte> _memory = new byte[1024];

    public void Resize(int newSize) {

        _memory = new byte[newSize]; // this will not update atomically

    }

    public byte this[int index] => _memory[index]; // this might see partial update

}

For most structs, tearing is at most a correctness bug and can be dealt with by making the fields (typed as the tearable struct type) non-public and synchronizing access to them. But since Span needs to be as fast as the array, access to the field cannot be synchronized. Also, because of the fact that Span accesses (and writes to) memory directly, having the _pointer and the _length be out of sync could result in memory safety being compromised.

The only other way (besides synchronizing access, which would be not practical) to avoid this issue is to make Span a stack-only type, i.e. its instances can reside only on the stack (which is accessed by one thread).

Span<T> will be stack-only

Span<T> will be a stack-only type; more precisely, it will be a by-ref type (just like its field in the fast implementation). This means that Spans cannot be boxed, cannot appear as a field of a non-stack-only type, and cannot be used as a generic argument. However, Span<T> can be used as a type of method arguments or return values.

We chose to make Span<T> stack-only as it solves several problems:

Efficient representation and access: Span<T> can be just managed pointer and length.
Efficient GC tracking: limit number of interior pointers that the GC have to track. Tracking of interior pointers in the heap during GC would be pretty expensive.
Safe concurrency (struct tearing discussed above): Span<T> assignment does not have to be atomic. Atomic assignment would be required for storing Span<T> on the heap to avoid data tearing issues.
Safe lifetime: Safe code cannot create dangling pointers by storing it on the heap when Span<T> points to unmanaged memory or stack memory. The unsafe stack frame responsible for creating unsafe Span is responsible for ensuring that it won’t escape the scope.
Reliable buffer pooling: buffers can be rented from a pool, wrapped in spans, the spans passed to user code, and when the stack unwinds, the program can reliably return the buffer to the pool as it can be sure that there are no outstanding references to the buffer.

The fast representation makes the type instances automatically stack-only, i.e. the constraint will be enforced by the CLR. This restriction should also be enforced by managed language compilers and/or analyzers for better developer experience. For the slow Span<T>, language compiler checks and/or analyzers is the only option (as the runtimes won't enforce the stack-only restriction).

Memory<T>

As alluded to above, in the upcoming months, many data transformation components in .NET (e.g. Base64Encoding, compressions, formatting, parsing) will provide APIs operating on memory buffers. We will do this work to develop no-copy/low-allocation end-to-end data pipelines, like the ASP.NET Channels. These APIs will use a collection of types, including, but not limited to, Span<T>, to represent various data pipeline primitives and exchange types.

This new collection of types must be usable by two distinct sets of customers:

Productivity developers (99% case): these are the developers who use LINQ, async, lambdas, etc., and often for good reasons care more about productivity than squeezing the last cycles out of some low level transformation routines.
Low level developers (1% case): our library and framework authors for whom performance is a critical aspect of their work.

Even though the goals of each group are different, they rely on each other to be successful. One is a necessary consumer of the other.

A stack-only type with the associated trade-offs is great for low level developers writing data transformation routines. Productivity developers, writing apps, may not be so thrilled when they realize that when using stack-only types, they lose many of the language features they rely on to get their jobs done (e.g. async await). And so, a stack-only type simply can’t be the primary exchange type we recommend for high level developers/scenarios/APIs.

For the whole platform to be successful, we must add an exchange type, currently called Memory<T>, that can be used with the full power of the language, i.e. it’s not stack-only. Memory<T> can be seen as a “promise” of a Span. It can be freely used in generics, stored on the heap, used with async await, and all the other language features we all love. When Memory<T> is finally ready to be manipulated by a data transformation routine, it will be temporarly converted to a span (the promise will be realized), which will provide much more efficient (remember "on par with array") access to the buffer's data.

See a prototype of Memory<T> here. Note that the prototype is currently not tearing safe. We will make it safe in the upcoming weeks.

Other Random Thoughts

Optimizations

We need to enable the existing array bounds check optimizations for Span<T> – in both the static compiler and the JIT – to make its performance on par with arrays. Longer term, we should optimize struct passing and construction to make slicing operations on Spans more efficient. Today, we recommend that Spans are sliced only when a shorted span needs to be passed to a different routine. Within a single routine, code should do index arithmetic to access subranges of spans.

Conversions

Span<T> will support reinterpret cast conversions to Span<byte>. It will also support unsafe casts between arbitrary primitive types. The reason for this limitation is that some processors don’t support efficient unaligned memory access.

A prototype of such API can be found here, and the API can be used as follows:

var bytes = new Span<byte>(buffer);

var characters = bytes.Cast<byte, char>();

if(char.IsLower(characters[0]) { ... }

Platform Support Plans

We want to ship Span<T> as a NuGet fat package available for .NET Standard 1.1 and above. Runtimes that support by-ref fields and returns will get the fast ref-field Span<T>. Other runtimes will get the slower three-field Span<T>.

Relationship to Array Slicing

Since Span<T> will be a stack-only type, it’s not well suited as the general representation of array slice. When an array is sliced, majority of our users expect the result to be either an array, or at least a type that is very similar to the array (e.g. ArraySegment<T>). We will design array slicing separately from Span<T>.

Covariance

Unlike T[], Span<T> will not support covariant casts, i.e. cast Span<Subtype> to Span<Basetype>. Because of that, we won’t be doing covariance checks when storing references in Span<T> instances.

Language Support

Separately from this document, we are exploring language features to better support Span<T>:

Enforcement of Stack-Only Type Restrictions

Span<T> and ReadOnlySpan<T> will be included in the set of built-in stack-only types. Any other struct containing one of these will be transitively considered a stack-only type. The compiler will error if a stack-only type is used in a disallowed context, e.g. used as a type argument, placed on the heap (boxed, passed to asynchronous call, used as a field of a class, etc.).

Language Support for pinning

Span<byte> buffer = ...

fixed(byte* pBuffer = buffer){

    ...

}

Slicing syntax

C# compiler will add slicing syntax, and Memory<T>, Span<T>, and ReadOnlySpan<T> will support it. The details are TBD, but imagine that a[1..5] calls a.Slice(1, 5) or a.Slice(new Range(1, 5)). c# Span<T> span = ... var slice = span[1..5];

Safe Span<T> stackalloc

void SafeMethod() {

    var buffer = stackalloc Span<byte>(128);

    PrimitiveFormatter.TryFormat(buffer, DateTime.Now, ...);

}

Primitive constraint

Some Span<T> operations are valid only for so called primitive type arguments. For example, the reinterpret cast operation.

We are exploring adding the ability to constrain type parameters to primitive types, i.e. types that are bit blittable. The cast operation would constrain its type parameters as follows:
```
public static Span Cast<T, U>(this Span<T> slice) where T:primitive where U:primitive

{ ... }
```

Open Issues

Detailed design of Memory<T> - Representation
- Operations
- Lifetime/Pinning
Span<T> API design details
- Type of the Length property
- Namespace and type name
Details of runtime optimizations

Span<T>的更多相关文章

CSS之div和span标签
div和span是非常重要的标签,div的语义是division"分割": span的语义就是span"范围.跨度". 这两个东西,都是最最重要的"盒 ...
如何改变span元素的宽度与高度
内联元素:也称为行内元素,当多个行内元素连续排列时,他们会显示在一行里面. 内联元素的特性:本身是无法设置宽度和高度属性的,但是可以通过CSS样式来控制,达到我们想要的宽度和高度. span举例1: ...
Android之TextView的样式类Span的使用详解
Android中的TextView是个显示文字的的UI类,在现实中的需求中,文字有各式各样的样式,TextView本身没有属性去设置实现,我们可以通过Android提供的 Spannab ...
火狐下多个span连在一起和换行写存在差异
当父元素的宽度确定,多个span换行写,span加起来占的宽度比预设的大
IE下a标签后面的span元素向右浮动后错位
错误原因span放在了a标签之后正确写法是放在之前如下: <li>2016-07-29<a href="#" ...
Jquery 关于span标签的取值赋值用法
span是最简单的容器,可以当作一个形式标签,其取值赋值方法有别于一般的页面元素. //赋值 $("#spanid").html(value) //取值 $("#span ...
div span
无牵无挂,不带任何样式,因此经常使用div完成整体样式的构建,span完成细微样式的构建. div为块级元素,span为行内元素. 使用div完成显示区域的居中.左右浮动等,完成整体的样式布局,然后在 ...
HTML5--div、span超出部分省略号显示
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/ ...
记SpannableString设多少span时注意事项
public void setSpan(Object what, int start, int end, int flags) { } 这个方法里的第一个参数,也就是一些span的对象,不能重复使用. ...
HTML 标签
定义和用法: 标签被用来组合文档中的行内元素.在行内定义一个区域,也就是一行内可以被划分成好几个区域,从而实现某种特定效果.本身 ...

随机推荐

江苏省选2019Round1游记
JSOI2019R1过去了. Day0 打板子.看白书. 晚上太热了,楼下还在打铁,睡不着. 折腾到好晚才勉强睡着. Day1 早上好困啊. 开T1.看起来T1的60分很可做的样子?5min打完了(为 ...
Spring Boot 2 (十)：Spring Boot 中的响应式编程和 WebFlux 入门
Spring 5.0 中发布了重量级组件 Webflux,拉起了响应式编程的规模使用序幕. WebFlux 使用的场景是异步非阻塞的,使用 Webflux 作为系统解决方案,在大多数场景下可以提高系统 ...
git学习之时光穿梭机
"x"修改readme.txt文件,改成如下内容: Git is a distributed version control system. Git is free softwar ...
朱晔的互联网架构实践心得S1E8：三十种架构设计模式（下）
朱晔的互联网架构实践心得S1E8:三十种架构设计模式(下) [下载本文PDF进行阅读] 接上文,继续剩下的15个模式. 数据管理模式 16.分片模式:将数据存储区划分为一组水平分区或分片一直有一个说 ...
mybatis入门配置文件解释及测试
这里介绍一下mybatis 根据mybatis的官网说明,mybatis是一款优秀的持久层框架,它支持定制化 SQL.存储过程以及高级映射.MyBatis 避免了几乎所有的 JDBC 代码和手动设置 ...
Oja’s rule
目录 Oja's rule 背景 Hebbian learning 主要的一些理论论文里面一些主要的假设引理1 引理2 引理3 定理1 LEMMA 3(ALL) 引理 4 定理 2 定理 3(关于 ...
matplotlib 入门之Sample plots in Matplotlib
文章目录 Line Plot One figure, a set of subplots Image 展示图片展示二元正态分布 A sample image Interpolating images ...
lower_bound函数与upper_bound函数
头文件 : algorithm vector<int>a a中的元素必须升序,用的是二分 lower_bound(a.begin(),a.end(),k) 返回a容器中,最右边的小于等于k ...
Stack Sorting CodeForces - 911E （思维+单调栈思想）
Let's suppose you have an array a, a stack s (initially empty) and an array b (also initially empty) ...
P66 整环的零元
R/I=0的零因子是0+I吗? 如果不是,那请问R/I的零因子是什么呢? R/I没有零因子 R/I的零元是I中的元素定义的等价类么 a是理想I的元素,自然也是R的元素

Span<T>