.NET 排序 Array.Sort<T> 实现分析

System.Array.Sort<T> 是.NET内置的排序方法, 灵活且高效, 大家都学过一些排序算法，比如冒泡排序,插入排序,堆排序等，不过你知道这个方法背后使用了什么排序算法吗?

先说结果, 实际上 Array.Sort 不止使用了一种排序算法, 为了保证不同的数据量的排序场景，都能有一个高性能的表现，实现中包括了插入排序,堆排序和快速排序, 接下来从通过源码看看它都做了哪些事情。

Array.Sort

https://source.dot.net/#System.Private.CoreLib/Array.cs,ec5718fae85b7640

public static void Sort<T>(T[] array)

{

    if (array == null)

        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);

    if (array.Length > 1)

    {

        var span = new Span<T>(ref MemoryMarshal.GetArrayDataReference(array), array.Length);

        ArraySortHelper<T>.Default.Sort(span, null);

    }

}

这里我们对 int 数组进行排序, 先看一下这个Sort方法, 当数组的长度大于1时, 会先把数组转成 Span 列表, 然后调用了内部的ArraySortHelper的Default对象的Sort方法。

ArraySortHelper

[TypeDependency("System.Collections.Generic.GenericArraySortHelper`1")]

internal sealed partial class ArraySortHelper<T>

    : IArraySortHelper<T>

{

    private static readonly IArraySortHelper<T> s_defaultArraySortHelper = CreateArraySortHelper();

    public static IArraySortHelper<T> Default => s_defaultArraySortHelper;

    [DynamicDependency("#ctor", typeof(GenericArraySortHelper<>))]

    private static IArraySortHelper<T> CreateArraySortHelper()

    {

        IArraySortHelper<T> defaultArraySortHelper;

        if (typeof(IComparable<T>).IsAssignableFrom(typeof(T)))

        {

            defaultArraySortHelper = (IArraySortHelper<T>)RuntimeTypeHandle.CreateInstanceForAnotherGenericParameter((RuntimeType)typeof(GenericArraySortHelper<string>), (RuntimeType)typeof(T));

        }

        else

        {

            defaultArraySortHelper = new ArraySortHelper<T>();

        }

        return defaultArraySortHelper;

    }

}

Default 会根据是否实现了 IComparable<T> 接口来创建不同的 ArraySortHelper, 因为上面我对int数组进行排序, 所以调用的是 GenericArraySortHelper 的Sort方法。

GenericArraySortHelper

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,280

internal sealed partial class GenericArraySortHelper<T>

        where T : IComparable<T>

    {

    // Do not add a constructor to this class because ArraySortHelper<T>.CreateSortHelper will not execute it

    #region IArraySortHelper<T> Members

    public void Sort(Span<T> keys, IComparer<T>? comparer)

    {

        try

        {

            if (comparer == null || comparer == Comparer<T>.Default)

            {

                if (keys.Length > 1)

                {

                    // For floating-point, do a pre-pass to move all NaNs to the beginning

                    // so that we can do an optimized comparison as part of the actual sort

                    // on the remainder of the values.

                    if (typeof(T) == typeof(double) ||

                        typeof(T) == typeof(float) ||

                        typeof(T) == typeof(Half))

                    {

                        int nanLeft = SortUtils.MoveNansToFront(keys, default(Span<byte>));

                        if (nanLeft == keys.Length)

                        {

                            return;

                        }

                        keys = keys.Slice(nanLeft);

                    }

                    IntroSort(keys, 2 * (BitOperations.Log2((uint)keys.Length) + 1));

                }

            }

            else

            {

                ArraySortHelper<T>.IntrospectiveSort(keys, comparer.Compare);

            }

        }

        catch (IndexOutOfRangeException)

        {

            ThrowHelper.ThrowArgumentException_BadComparer(comparer);

        }

        catch (Exception e)

        {

            ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_IComparerFailed, e);

        }

    }

首先会判断排序的类型是否是浮点型, 如果是的会做一些排序的调整优化，然后调用了 IntroSort 方法，并传入了两个参数，第一个Keys就是数组的Span列表，那第二个是什么呢? 它是一个int类型的depthLimit参数，这里简单点理解就是算出数组的深度，因为后边会根据这个值进行递归操作，然后进入到 IntroSort 方法。

IntroSort

到这个方法这里就清晰很多了, 这是Array.Sort<T> 排序的主要内容，接着往下看

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,404

 private static void IntroSort(Span<T> keys, int depthLimit)

{

    Debug.Assert(!keys.IsEmpty);

    Debug.Assert(depthLimit >= 0);

    int partitionSize = keys.Length;

    while (partitionSize > 1)

    {

        if (partitionSize <= Array.IntrosortSizeThreshold)

        {

            if (partitionSize == 2)

            {

                SwapIfGreater(ref keys[0], ref keys[1]);

                return;

            }

            if (partitionSize == 3)

            {

                ref T hiRef = ref keys[2];

                ref T him1Ref = ref keys[1];

                ref T loRef = ref keys[0];

                SwapIfGreater(ref loRef, ref him1Ref);

                SwapIfGreater(ref loRef, ref hiRef);

                SwapIfGreater(ref him1Ref, ref hiRef);

                return;

            }

            InsertionSort(keys.Slice(0, partitionSize));

            return;

        }

        if (depthLimit == 0)

        {

            HeapSort(keys.Slice(0, partitionSize));

            return;

        }

        depthLimit--;

        int p = PickPivotAndPartition(keys.Slice(0, partitionSize));

        // Note we've already partitioned around the pivot and do not have to move the pivot again.

        IntroSort(keys[(p+1)..partitionSize], depthLimit);

        partitionSize = p;

    }

}

第一次进入方法时，partitionSize 就是数组的长度, 这里有一个判断条件，如下, IntrosortSizeThreshold 是一个值为16的常量，它是一个阈值, 如果数组的长度小于等于16, 那么使用的就是插入排序(InsertionSort), 为什么是16呢？这里通过注释了解到, 从经验上来看, 16及以下得数组长度使用插入排序的效率是比较高的。

if (partitionSize <= Array.IntrosortSizeThreshold)

{

    if (partitionSize == 2)

    {

        SwapIfGreater(ref keys[0], ref keys[1]);

        return;

    }

    if (partitionSize == 3)

    {

        ref T hiRef = ref keys[2];

        ref T him1Ref = ref keys[1];

        ref T loRef = ref keys[0];

        SwapIfGreater(ref loRef, ref him1Ref);

        SwapIfGreater(ref loRef, ref hiRef);

        SwapIfGreater(ref him1Ref, ref hiRef);

        return;

    }

    InsertionSort(keys.Slice(0, partitionSize));

    return;

}

InsertionSort

如果数组的长度小于等于3时, 直接进行对比交换, 如果长度大约3并且小于等于16的话, 使用插入排序(InsertionSort), 方法内容如下:

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,537

private static void InsertionSort(Span<T> keys)

{

    for (int i = 0; i < keys.Length - 1; i++)

    {

        T t = Unsafe.Add(ref MemoryMarshal.GetReference(keys), i + 1);

        int j = i;

        while (j >= 0 && (t == null || LessThan(ref t, ref Unsafe.Add(ref MemoryMarshal.GetReference(keys), j))))

        {

            Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = Unsafe.Add(ref MemoryMarshal.GetReference(keys), j);

            j--;

        }

        Unsafe.Add(ref MemoryMarshal.GetReference(keys), j + 1) = t!;

    }

}

HeapSort

if (depthLimit == 0)

{

    HeapSort(keys.Slice(0, partitionSize));

    return;

}

depthLimit--;

因为后边是递归操作，所以每次 depthLimit 都会减1, 当深度为0排序还没有完成的时候，就会直接使用堆排序(HeapSort)，方法内容如下：

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,990



private static void HeapSort(Span<TKey> keys, Span<TValue> values)

{

    Debug.Assert(!keys.IsEmpty);

    int n = keys.Length;

    for (int i = n >> 1; i >= 1; i--)

    {

        DownHeap(keys, values, i, n);

    }

    for (int i = n; i > 1; i--)

    {

        Swap(keys, values, 0, i - 1);

        DownHeap(keys, values, 1, i - 1);

    }

}

private static void DownHeap(Span<TKey> keys, Span<TValue> values, int i, int n)

{

    TKey d = keys[i - 1];

    TValue dValue = values[i - 1];

    while (i <= n >> 1)

    {

        int child = 2 * i;

        if (child < n && (keys[child - 1] == null || LessThan(ref keys[child - 1], ref keys[child])))

        {

            child++;

        }

        if (keys[child - 1] == null || !LessThan(ref d, ref keys[child - 1]))

            break;

        keys[i - 1] = keys[child - 1];

        values[i - 1] = values[child - 1];

        i = child;

    }

    keys[i - 1] = d;

    values[i - 1] = dValue;

}

QuickSort

int p = PickPivotAndPartition(keys.Slice(0, partitionSize), values.Slice(0, partitionSize));

IntroSort(keys[(p+1)..partitionSize], values[(p+1)..partitionSize], depthLimit);

partitionSize = p;

这里调用了另外一个方法 PickPivotAndPartition,

Pivot 基准, Partition 分区, 这就是快速排序呀！而且还是使用了尾递归的快速排序，其中也使用了三数取中法，方法内容如下

https://source.dot.net/#System.Private.CoreLib/ArraySortHelper.cs,945

private static int PickPivotAndPartition(Span<TKey> keys, Span<TValue> values)

{

    Debug.Assert(keys.Length >= Array.IntrosortSizeThreshold);

    int hi = keys.Length - 1;

    // Compute median-of-three.  But also partition them, since we've done the comparison.

    int middle = hi >> 1;

    // Sort lo, mid and hi appropriately, then pick mid as the pivot.

    SwapIfGreaterWithValues(keys, values, 0, middle);  // swap the low with the mid point

    SwapIfGreaterWithValues(keys, values, 0, hi);   // swap the low with the high

    SwapIfGreaterWithValues(keys, values, middle, hi); // swap the middle with the high

    TKey pivot = keys[middle];

    Swap(keys, values, middle, hi - 1);

    int left = 0, right = hi - 1;  // We already partitioned lo and hi and put the pivot in hi - 1.  And we pre-increment & decrement below.

    while (left < right)

    {

        if (pivot == null)

        {

            while (left < (hi - 1) && keys[++left] == null) ;

            while (right > 0 && keys[--right] != null) ;

        }

        else

        {

            while (GreaterThan(ref pivot, ref keys[++left])) ;

            while (LessThan(ref pivot, ref keys[--right])) ;

        }

        if (left >= right)

            break;

        Swap(keys, values, left, right);

    }

    // Put pivot in the right location.

    if (left != hi - 1)

    {

        Swap(keys, values, left, hi - 1);

    }

    return left;

}

总结

本文主要介绍了System.Array.Sort<T> 排序的内部实现, 发现它使用了插入排序,堆排序和快速排序，大家有兴趣可以看一下Java或者Golang的排序实现，希望对您有用。