Non-Nullable Types vs C#: Fixing the Billion Dollar Mistake (Repost)

One of the top suggestions (currently #15 on uservoice) for improving C# is the addition of non-nullable reference types. This is not surprising, considering the number of functions that start with a block of ‘if (x == null) throw new ArgumentNullException(“x”)’ lines. Not to mention the head-slapping bugs null pointers cause. There’s a reason Tony Hoare calls the null pointer his billion dollar mistake.
In this post I will talk about the obstacles that make adding non-nullable types to C# difficult, and propose a way forward.

Obstacles

Adding non-nullable types to C# seems, on the surface, easy. Put a “!” at the end of the type to mean non-nullable, add some nullable/non-nullable conversion operators, implement a few compiler checks and you’re done… right? Oops, we just broke the cohesiveness of the language. The compiler keeps refusing to compile “new object![10]” (it can’t figure out what to fill the array with initially). Naturally, all of the generic classes that happen to use arrays also refuse to work for the same reason (goodbye List<T>), but some generic classes that don’t use arrays, like TaskCompletionSource<T>, also fail. Bleh.

I should note at this point that C# is a mature language with mountains of already-written code that we aren’t allowed to break. If adding non-null types to the language breaks existing code, then non-null types won’t get added. We have to work within the constraints of backwards compatibility, which is tricky when removing widely-used assumptions. To give an idea of the sorts of code we can’t break, I’ve put together a list of the cases I think are important. Before reading on, you may want to spend a minute imagining how you would add non-nullability to C#. See if your idea meshes with each case in an elegant way:

Existing generic code allocates arrays:

public T[] ToArray(IReadOnlyCollection<T> items) {
    var r = new T[items.Count];
    ...
    return r;
}

Existing generic code uses default(T):

public T FirstOrDefault(IEnumerable<T> items) {
    using (var e = items.GetEnumerator()) {
        return e.MoveNext() ? e.Current : default(T);
    }
}

Existing generic code propagates generic parameters:

default(KeyValuePair<K, V>) // if K or V has no default value...

Existing generic classes that intuitively should accept non-nullable types (a.k.a. are “non-null safe”) may use constructs that assume a default value exists:
```
public class List<T> {
    ...
        _items = new T[capacity];
    ...
}
```

Classes that are intuitively non-null safe may not always initialize fields, implicitly assuming a default value is used (and you may be able to access that value via reflection):

public struct Maybe<T> {
    private readonly T _value;
    public readonly bool HasValue;
    public T Value {
        get {
            if (!HasValue) throw new InvalidOperationException();
            return _value;
        }
    }
    // default constructor creates a 'no value' instance where _value = default(T)
    public Maybe(T value) { HasValue = true; _value = value; }
}

A common pattern, ‘bool TryX(out T result)’, assumes a default value to assign to ‘result’ when failing:

bool TryParseValue(out T value) {
    if (...) {
        ...
        value = ...
        return true;
    }
    value = default(T);
    return false;
}

Some interfaces are intuitively non-null safe, but use the TryX pattern:

public interface IReadOnlyDictionary<TKey, TValue> : IReadOnlyCollection<KeyValuePair<TKey, TValue>> {
    ...
    bool TryGetValue(TKey key, out TValue value);
    ...
}

Most interfaces happen to be (somewhat) naively non-null safe (although implementations may not be), but it’s possible to create ones that aren’t:
```
public interface IMustHaveDefault<T> {
    void DoIt(T value = default(T));
}
```
Analogously, most delegates happen to be non-null safe, but it’s not guaranteed:
```
delegate void MustHaveDefault<T>(T value = default(T));
```

Existing code may extend legacy code that won’t be updated for non-nullability:

interface IBloomFilter<T> : SomeOldUnmaintainedLibrary.IHeuristicSet<T> {
    ...
}

How did your impromptu idea do?

The fundamental problem here is an assumption deeply ingrained into C#: the assumption that every type has a default value. Consider: if T doesn’t have (or might not have) a default value then the compiler has nothing to use when evaluating default(T), initializing a field of type T, or initializing the items in a new array of T. This is a problem when it comes to non-null reference types because, although some reference types have a decent non-null default value (e.g. the default non-null String could be the empty string), most do not. Consider: what is the default non-null value of IEnumerator<int>? IObservable<bool>? UserControl? NetworkStream? The simple answer is that they don’t have one. The “best” you can do is give a mock instance that fails if you try to use it… but we already have that and it’s called null.
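The default-value assumption is easy to observe in current C#. The following is a minimal sketch (the class and method names are made up for illustration) showing that both default(T) and a freshly allocated T[] quietly hand back null when T is a reference type, which is exactly what a non-nullable T could not allow:

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class DefaultValueDemo {
    // Returns whatever the runtime considers the default value of T.
    public static T DefaultOf<T>() {
        return default(T);
    }

    // Allocates an array and returns its first slot without ever assigning to it.
    public static T FirstSlotOfNewArray<T>() {
        var items = new T[1];
        return items[0];
    }
}
```

Calling either method with string as the type argument yields null, while calling them with int yields 0.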

Note that there may be important cases I’ve missed here. C# is a very large language and I don’t have experience with all of its parts, particularly native interop things like unsafe and pinned. There are surely plenty of complications with respect to type inference and lambdas that I’m not exploring. I’m also going to gloss over the implications on reflection, other than to note that the result of GetType will be unable to indicate non-nullability and that this may be counter-intuitive to users. (Hopefully whatever I’ve overlooked won’t make what I propose in the next section utterly useless.)

Proposed Solution

All the obstacles I’ve mentioned can be overcome. The way I’ve approached the problem is by adding three bits of syntax to C#: an ‘is non-null’ type modifier, a ‘may be non-null’ type parameter modifier, and a ‘withdefault’ keyword to undo making a type non-nullable. The basic idea for making code non-null safe is to wrap T! into withdefault(T!) on the way in and cast back to T! on the way out.

I find it difficult to succinctly explain what I mean in prose, so I’m just going to go with a list. These are the semantic changes I would make to C# to allow non-nullable types:

Appending “!” to a (nullable) reference type means “is non-nullable”. A variable of type “object!” can reference an object, but may not be a null reference.
Appending “!” to a generic type parameter means “is potentially non-nullable”. The type parameter T in “class C<T>” can’t be “object!”, but it could be if the declaration was “class C<T!>”.
Applying “withdefault” to a non-nullable reference type yields the associated nullable reference type; applied to any other type, it yields that type unchanged. withdefault(object!) = object = withdefault(object), withdefault(int) = int, withdefault(int?) = int?.
For any (nullable) reference type T there is an explicit cast operator from T to T! that throws when given null.
A T! “is a” T. Consequently, for example, an IEnumerable<T!> is an IEnumerable<T> by covariance.
The expression “default(T)” is a compile error when T is potentially non-nullable.
The expression “new T[n]” is a compile error when T is potentially non-nullable and n is not a constant zero. Note that new T[] { … } may still work.
Using a potentially non-nullable type as the argument to a generic type parameter that is not marked as potentially non-nullable is a compile error.
A struct or class containing a field with a potentially non-nullable type is not given a default empty constructor by the compiler.
In a constructor, all fields with a potentially non-nullable type must be initialized before ‘this’ can be used.
The type of a constructor invocation expression is now non-null when the constructed type is a reference type. For example, “new object()” has type “object!”.
A few existing compiler errors, like disallowing constraining a generic parameter by a sealed type, need to be removed or rethought (because T! is a T, even if T is sealed).
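The explicit cast from T to T! in the list above cannot be written in current C#, but its runtime behaviour can be approximated with an ordinary generic method. This is only a sketch: NonNull.Require is a hypothetical name, the choice of ArgumentNullException is an assumption (the proposal only says the cast throws), and unlike a real T! the result carries no compile-time guarantee.

```csharp
using System;

// Hypothetical stand-in for the proposed explicit cast from T to T!.
public static class NonNull {
    // Returns the value unchanged, or throws if it is a null reference.
    public static T Require<T>(T value) where T : class {
        if (value == null) throw new ArgumentNullException(nameof(value));
        return value;
    }
}
```

A call such as NonNull.Require(someString) plays the role of (string!)someString: it passes non-null values through and fails immediately on null instead of letting it travel further.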

Additionally, the following things should not be breaking changes, when done by the user:

Changing usage of a type T to withdefault(T) or vice versa, unless T is potentially non-nullable. This allows tweaking the return types of generic methods to make sense when T is non-nullable, without breaking existing code.
Changing a type argument from withdefault(T) to T when the type parameter is covariant, or vice versa when the type parameter is contra-variant. This is useful for interop with legacy code because we can expose non-nullability right away without painting ourselves into a corner. For example, suppose IEnumerable<T> has not been marked non-null safe. A non-null safe class can implement IEnumerable<withdefault(T)> in the interim and, once IEnumerable is made non-null safe, implementing IEnumerable<T> instead will not break existing code because a T! is a T.

Finally, some useful additions to the .Net base class library that I would recommend, although they aren’t necessary:

A special method to create an array of a non-null type from an IReadOnlyCollection<T!>.
A non-null safe array type that can be initialized incrementally (perhaps a better name would be CappedList<T!>).
A standard maybe/option type that is non-null safe.
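To make the second suggestion more concrete, here is a rough sketch of what an incrementally initialized array type might look like in current C#. Only the name CappedList comes from the list above; every member is an assumption made for illustration, and because T! and withdefault do not exist, the backing store is a plain T[] whose unfilled slots are simply never exposed.

```csharp
using System;

// Hypothetical sketch of an array-like type that is filled incrementally.
// The backing array may hold default values, but callers can never observe them.
public sealed class CappedList<T> {
    private readonly T[] _items; // would be withdefault(T)[] under the proposal
    private int _count;

    public CappedList(int capacity) {
        if (capacity < 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public int Count { get { return _count; } }
    public int Capacity { get { return _items.Length; } }

    // Appends an item; fails once the fixed capacity is used up.
    public void Add(T item) {
        if (_count == _items.Length) throw new InvalidOperationException("List is full.");
        _items[_count] = item;
        _count++;
    }

    // Only slots that were filled by Add can be read.
    public T this[int index] {
        get {
            if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException(nameof(index));
            return _items[index];
        }
    }
}
```

The design choice is the same “withdefault in, cast out” idea: default values may exist internally, but the bounds check against Count keeps them from flowing out.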

Given these language changes, it is relatively simple to update/implement code with non-null safety. As I’ve already mentioned, you just abuse withdefault(T!) and casting from T to T! to assert to the compiler that you’ve done things right (if you haven’t, you’ll get an exception during the cast at runtime). I’ll go over some examples in a moment. As more code is made non-null safe, the amount of casting and withdefault-ing you need should go down.

The changes to make code non-null safe are so simple that you might expect the conversion to be automatable. Unfortunately, human judgement is necessary in some cases. For example, the return type of FirstOrDefault<T!> is withdefault(T!) but the return type of First<T!> is just T!. Doing the conversion automatically would require analyzing the implementations of those methods to figure out that default(T) might flow out of FirstOrDefault<T> but not First<T>. But even with a magical halting-problem-avoiding analysis, we’d find it impossible to infer the non-nullability of interfaces and abstract classes, because their implementing code may not even be in the same assembly! We must update the code by hand.
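The difference between the two methods can be seen with the existing LINQ operators. In this small sketch, First and FirstOrDefault are the real System.Linq methods, while the wrapper class and its method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class wrapping the real LINQ First and FirstOrDefault.
public static class FirstVersusFirstOrDefault {
    // May return default(T), i.e. null for reference types, when the sequence is empty.
    public static string FirstOrNull(IEnumerable<string> items) {
        return items.FirstOrDefault();
    }

    // First never returns a default value: an empty sequence causes an exception instead.
    public static bool FirstThrowsOnEmpty(IEnumerable<string> items) {
        try {
            items.First();
            return false;
        } catch (InvalidOperationException) {
            return true;
        }
    }
}
```

Nothing in the two signatures reveals this difference, which is why the choice between T! and withdefault(T!) as the return type has to be made by a person who knows what each method does.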

Examples

In order to give you a more concrete taste of how non-nullable types would work in practice if this proposal were implemented, I’ve prepared two examples. The first is a simple maybe/option type that is non-null safe:

/// May contain a value or no value. Non-null safe.
public struct Maybe<T!> {
    private readonly withdefault(T) _value;
    public readonly bool HasValue;
    public T Value {
        get {
            if (!HasValue) throw new InvalidOperationException();
            return (T)_value;
        }
    }
    // note: has default constructor for 'no value'
    public Maybe(T value) { HasValue = true; _value = value; }
}

As you can see, the _value field is of type “withdefault(T)” but exposed as type T by using a cast inside the Value property. Note that if you tried to change the field to type T, the compiler would omit the default constructor. As a result, you would need to implement it explicitly (otherwise you can’t create the No Value instance) and, in doing so, would discover it to be impossible to satisfy the requirement that _value be initialized before ‘this’ can be accessed. Most classes would be updated in the same way: withdefault in, cast out.

The second example I have is more involved, because it uses a real existing class. I call it “how to make System.Collections.Generic.Dictionary non-null safe (in spirit)”. I used Reflector to get source code for Dictionary (and cleaned it up a bit with ReSharper). However, to keep the example short, I am only including the notable changes necessary to upgrade the important public methods (Add, Remove, this[]) and the implementation of the IReadOnlyDictionary interface. The original post highlighted additions in green and deletions in red; that highlighting is not preserved here, but every “!”, every withdefault, and every (TKey) or (TValue) cast in the listing is an addition:

public interface IEnumerable<out T!> : IEnumerable {
...
public interface IReadOnlyCollection<out T!> : IEnumerable<T> {
...
public interface IReadOnlyDictionary<TKey!, TValue!> : IReadOnlyCollection<KeyValuePair<TKey, TValue>> {
    bool ContainsKey(TKey key);
    bool TryGetValue(TKey key, out withdefault(TValue) value);
    TValue this[TKey key] { get; }
...
public class Dictionary<TKey!, TValue!> : IReadOnlyDictionary<TKey, TValue> {
    private int[] _buckets;
    private int _count;
    private Entry[] _entries;
    private int _freeCount;
    ...
            for (var j = 0; j < _count; j++) {
                if ((_entries[j].HashCode >= 0) && comparer.Equals((TValue)_entries[j].Value, value))
                    return true;
            }
    ...
            for (var i = _buckets[num%_buckets.Length]; i >= 0; i = _entries[i].Next) {
                if (_entries[i].HashCode == num && Comparer.Equals((TKey)_entries[i].Key, key))
                    return i;
            }
    ...
    internal withdefault(TValue) GetValueOrDefault(TKey key) {
        var index = FindEntry(key);
        if (index >= 0) return _entries[index].Value;
        return default(withdefault(TValue));
    }
    ...
        for (var i = _buckets[index]; i >= 0; i = _entries[i].Next) {
            if ((_entries[i].HashCode == num) && Comparer.Equals((TKey)_entries[i].Key, key)) {
                if (add)
    ...
            for (var i = _buckets[index]; i >= 0; i = _entries[i].Next) {
                if ((_entries[i].HashCode == num) && Comparer.Equals((TKey)_entries[i].Key, key)) {
                    ...
                    _entries[i].HashCode = -1;
                    _entries[i].Next = _freeList;
                    _entries[i].Key = default(withdefault(TKey));
                    _entries[i].Value = default(withdefault(TValue));
                    _freeList = i;
                    _freeCount++;
                    _version++;
                    return true;
    ...
            for (var k = 0; k < _count; k++) {
                if (destArray[k].HashCode != -1)
                    destArray[k].HashCode = Comparer.GetHashCode((TKey)destArray[k].Key) & 0x7fffffff;
            }
    ...
    public bool TryGetValue(TKey key, out withdefault(TValue) value) {
        var index = FindEntry(key);
        if (index >= 0) {
            value = _entries[index].Value;
            return true;
        }
        value = default(withdefault(TValue));
        return false;
    }
    ...
    public TValue this[TKey key] {
        ...
                return (TValue)_entries[index].Value;
    ...
    private struct Entry {
        public int HashCode;
        public int Next;
        public withdefault(TKey) Key;
        public withdefault(TValue) Value;
    }
    ...
    public struct Enumerator : IEnumerator<KeyValuePair<TKey, TValue>> {
        ...
        private withdefault(KeyValuePair<TKey, TValue>) _current;
        ...
        public bool MoveNext() {
            while (_index < _dictionary._count) {
                if (_dictionary._entries[_index].HashCode >= 0) {
                    _current = new KeyValuePair<TKey, TValue>(
                        (TKey)_dictionary._entries[_index].Key,
                        (TValue)_dictionary._entries[_index].Value);
                    _index++;
                    return true;
                }
                this._index++;
            }
            this._index = this._dictionary._count + 1;
            // replaces the deleted expression: new KeyValuePair<TKey, TValue>()
            this._current = default(withdefault(KeyValuePair<TKey, TValue>));
            return false;
        }
    ...

Once again you can see the “write withdefault(T), read T by casting” technique, except it is used in several places. Otherwise the only notable change is to the signature of TryGetValue: the out parameter now has type withdefault(TValue). You might expect this to break existing code, because we’re changing the signature, but it works out that we only change the signature in new cases. TValue couldn’t be a non-nullable reference type before, and withdefault(T) = T in that case.

Summary

Adding non-null types to C# is doable, but not simple and not cheap. I’m sure it overcomes the “features start at -100 points” threshold, but that’s before considering the implementation costs. Even if the feature were already implemented in the language, there would still be mountains of existing classes that need to be updated.
We may never see non-null types in C#, but I hope we do.
