The Swiss Army Knife of Data Structures … in C#
Background:
Created by Ralf Hinze and Ross Paterson in 2004, and based to a large extent on the work of Chris Okasaki on Implicit Recursive Slowdown and Catenable Double-Ended Queues, this data structure, to quote the abstract of the paper introducing Finger Trees, is:
"a functional representation of persistent sequences supporting access to the ends in amortized constant time, and concatenation and splitting in time logarithmic in the size of the smaller piece. Representations achieving these bounds have appeared previously, but 2-3 finger trees are much simpler, as are the operations on them. Further, by defining the split operation in a general form, we obtain a general purpose data structure that can serve as a sequence, priority queue, search tree, priority search queue and more."
Why the finger tree deserves to be called the Swiss Army knife of data structures is best explained by quoting the introduction of the paper:
"The operations one might expect from a sequence abstraction include adding and removing elements at both ends (the deque operations), concatenation, insertion and deletion at arbitrary points, finding an element satisfying some criterion, and splitting the sequence into subsequences based on some property. Many efficient functional implementations of subsets of these operations are known, but supporting more operations efficiently is difficult. The best known general implementations are very complex, and little used.
This paper introduces functional 2-3 finger trees, a general implementation that performs well, but is much simpler than previous implementations with similar bounds. The data structure and its many variations are simple enough that we are able to give a concise yet complete executable description using the functional programming language Haskell (Peyton Jones, 2003). The paper should be accessible to anyone with a basic knowledge of Haskell and its widely used extension to multiple-parameter type classes (Peyton Jones et al., 1997). Although the structure makes essential use of laziness, it is also suitable for strict languages that provide a lazy evaluation primitive."
Efficiency and universality are the two most attractive features of finger trees. No less important is simplicity, as it allows easy understanding, straightforward implementation and uneventful maintenance.
Stacks support efficient access to the first item of a sequence only; queues and deques support efficient access to both ends, but not to a randomly accessed item. Arrays allow extremely efficient O(1) access to any of their items, but are poor at insertion, removal, splitting and concatenation. Lists are poor (O(N)) at locating a randomly indexed item.
Remarkably, the finger tree is efficient with all these operations. One can use this single data structure for all these types of operations as opposed to having to use several types of data structures, each most efficient with only some operations.
Note also the words functional and persistent, which mean that the finger tree is an immutable data structure.
In .NET, the IList<T> interface specifies a number of void methods that change the list in place (so the instance is mutable). To perform an immutable operation, one first needs to make a copy of the original structure (List<T>, LinkedList<T>, etc.). An achievement of .NET 3.5 and LINQ is that the new extension methods of the Enumerable class implement immutable operations.
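The contrast can be illustrated with a minimal sketch. The class name ImmutableOpsDemo and the helper WithAppended are made up for this illustration; only List<T>.Add and Enumerable.Concat are real library members. Note that Concat is non-mutating but deferred, so the result still reads from the original list when it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ImmutableOpsDemo
{
    // Hypothetical helper: returns a new sequence with 'item' appended.
    // 'source' itself is never modified.
    public static IEnumerable<int> WithAppended(IEnumerable<int> source, int item)
    {
        return source.Concat(new[] { item });
    }

    public static void Main()
    {
        List<int> original = new List<int> { 1, 2, 3 };

        // In-place style: Add returns void and changes the instance,
        // so an explicit copy is needed to keep 'original' intact.
        List<int> copy = new List<int>(original);
        copy.Add(4);

        // LINQ style: Concat leaves 'original' untouched.
        IEnumerable<int> extended = WithAppended(original, 4);

        Console.WriteLine(original.Count);   // 3
        Console.WriteLine(copy.Count);       // 4
        Console.WriteLine(extended.Count()); // 4
    }
}
```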
As of 2008, Finger Tree implementations were known in only a few programming languages: Haskell, OCaml and Scala. At least this is what the popular search engines say.
What about a C# implementation? In February, Eric Lippert had a post in his blog about finger trees. The C# code he provided does not implement all operations of a Finger Tree, and this is probably why Wikipedia refers to the post only as "Example of 2-3 trees in C#", and not as an implementation of the Finger Tree data structure. Actually, he did have a complete implementation at that time (see the Update at the start of this post), but decided not to publish it.
My modest contribution is what I believe to be the first published complete C# implementation of the Finger Tree data structure as originally defined in the paper by Hinze and Paterson (only a few exercises have not been implemented).
Programming a Finger Tree in C# was as much fun as it was a challenge. The finger tree structure is defined in an extremely generic way. At first I was even concerned that C# might not be sufficiently expressive to implement such rich genericity. It turned out that C# lived up to the challenge perfectly. Here is a small example of how the code uses multiple types and nested type constraints:
// Types:
// U: the type of Containers that can be split
// T: the type of elements in a container of type U
// V: the type of the Measure-value when an element is measured
public class Split<U, T, V>
    where U : ISplittable<T, V>
    where T : IMeasured<V>
{
    // ………………………………………………….
}
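To show what such chained constraints buy, here is a self-contained sketch. The interfaces IMeasuredSketch and ISplittableSketch are simplified stand-ins invented for this example; they are not the IMeasured and ISplittable interfaces of the actual implementation, whose members are not shown in this post.

```csharp
using System;

// Hypothetical stand-in: an element that reports a measure of type V.
public interface IMeasuredSketch<V>
{
    V Measure { get; }
}

// Hypothetical stand-in: a container whose elements are measured.
public interface ISplittableSketch<T, V> where T : IMeasuredSketch<V>
{
    T First { get; }
}

// A word measured by its length.
public class Word : IMeasuredSketch<int>
{
    private readonly string text;
    public Word(string text) { this.text = text; }
    public int Measure { get { return text.Length; } }
}

// A trivial container holding exactly one word.
public class SingleWord : ISplittableSketch<Word, int>
{
    private readonly Word word;
    public SingleWord(Word word) { this.word = word; }
    public Word First { get { return word; } }
}

public static class ConstraintDemo
{
    // The constraints tell the compiler that container.First is a T
    // and that every T exposes a Measure of type V.
    public static V MeasureFirst<U, T, V>(U container)
        where U : ISplittableSketch<T, V>
        where T : IMeasuredSketch<V>
    {
        return container.First.Measure;
    }

    public static void Main()
    {
        SingleWord c = new SingleWord(new Word("finger"));
        // T and V are not inferred from the constraints, so all
        // three type arguments are spelled out.
        int m = MeasureFirst<SingleWord, Word, int>(c);
        Console.WriteLine(m); // 6
    }
}
```

The price of this style is verbosity at call sites: the compiler does not infer type arguments from constraints, so they often have to be written explicitly.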
Another challenge was to implement lazy evaluation (the .NET term for this is "deferred execution") for some of the methods. Again, C# was up to the challenge with its IEnumerable interface and the ease and finesse of using the "yield return" statement.
The net result: it was possible to write code like this:
public override IEnumerable<T> ToSequence()
{
    ViewL<T, M> lView = LeftView();
    yield return lView.head;
    foreach (T t in lView.ftTail.ToSequence())
        yield return t;
}
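For readers new to "yield return", the following sketch shows deferred execution in isolation. The class LazyDemo, the method Numbers and the counter Produced are invented for this illustration and are not part of the Finger Tree code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many items have actually been generated so far.
    public static int Produced;

    // Hypothetical generator: yields 0, 1, 2, ... up to count - 1.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return i;
        }
    }

    public static void Main()
    {
        // Calling the iterator method runs none of its body yet.
        IEnumerable<int> seq = Numbers(1000000);
        Console.WriteLine(Produced); // 0

        // Only the items needed to answer the query are generated.
        int third = seq.Skip(2).First();
        Console.WriteLine(third);    // 2
        Console.WriteLine(Produced); // 3
    }
}
```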
Another challenge, of course, was that one definitely needs to understand Hinze and Paterson's article before even trying to start the design of an implementation. While the text should be straightforward to anyone with some Haskell and functional programming experience, it requires a bit of concentration and some very basic understanding of fundamental algebraic concepts. In the text of the article one will find a precise and simple definition of a Monoid. My first thought was that such academic knowledge would not really be necessary for a real-world programming task. Little did I know… It turned out that the Monoid plays a central role in the generic specification of objects that have a Measure.
I was thrilled to code my own version of a monoid in C#:
public class Monoid<T>
{
    T theZero;
    public delegate T monOp(T t1, T t2);
    public monOp theOp;
    public Monoid(T tZero, monOp aMonOp)
    {
        theZero = tZero;
        theOp = aMonOp;
    }
    public T zero
    {
        get
        {
            return theZero;
        }
    }
}
Without going into too much detail, here is how the appropriate Monoids are defined in auxiliary classes, to be used in defining a Random-Access Sequence, a Priority Queue and an Ordered Sequence:
public static class Size
{
    public static Monoid<uint> theMonoid =
        new Monoid<uint>(0, new Monoid<uint>.monOp(anAddOp));
    public static uint anAddOp(uint s1, uint s2)
    {
        return s1 + s2;
    }
}
public static class Prio
{
    public static Monoid<double> theMonoid =
        new Monoid<double>
            (double.NegativeInfinity,
             new Monoid<double>.monOp(aMaxOp)
            );
    public static double aMaxOp(double d1, double d2)
    {
        return (d1 > d2) ? d1 : d2;
    }
}
public class Key<T, V> where V : IComparable
{
    public delegate V getKey(T t);
    // maybe we shouldn’t care for NoKey, as this is too theoretical
    public V NoKey;
    public getKey KeyAssign;
    public Key(V noKey, getKey KeyAssign)
    {
        // store the "no key" value; KeyMonoid uses it as the monoid's zero
        this.NoKey = noKey;
        this.KeyAssign = KeyAssign;
    }
}
public class KeyMonoid<T, V> where V : IComparable
{
    public Key<T, V> KeyObj;
    public Monoid<V> theMonoid;
    public V aNextKeyOp(V v1, V v2)
    {
        return (v2.CompareTo(KeyObj.NoKey) == 0) ? v1 : v2;
    }
    //constructor
    public KeyMonoid(Key<T, V> KeyObj)
    {
        this.KeyObj = KeyObj;
        this.theMonoid =
            new Monoid<V>(KeyObj.NoKey,
                          new Monoid<V>.monOp(aNextKeyOp)
                         );
    }
}
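What the three monoids compute is easiest to see by folding a few values with each of them. The sketch below is independent of the classes above: MonoidFoldDemo and Fold are names invented for this illustration, Func<T, T, T> replaces the monOp delegate, and the value -1 playing the role of NoKey is an assumption made only for the example.

```csharp
using System;
using System.Collections.Generic;

public static class MonoidFoldDemo
{
    // Combines all values with 'op', starting from the identity 'zero'.
    // For an empty sequence the result is 'zero' itself.
    public static T Fold<T>(T zero, Func<T, T, T> op, IEnumerable<T> values)
    {
        T acc = zero;
        foreach (T v in values)
            acc = op(acc, v);
        return acc;
    }

    public static void Main()
    {
        // Size: (0, +). Each element measures as 1, so the fold counts them.
        uint size = Fold<uint>(0u, (a, b) => a + b,
                               new uint[] { 1, 1, 1, 1 });

        // Prio: (-infinity, max). The fold yields the highest priority.
        double top = Fold<double>(double.NegativeInfinity,
                                  (a, b) => Math.Max(a, b),
                                  new[] { 2.5, 7.0, 4.0 });

        // Key: (NoKey, "take the right operand unless it is NoKey").
        // -1 stands in for NoKey here (an assumption of this sketch).
        int noKey = -1;
        int last = Fold<int>(noKey, (a, b) => b == noKey ? a : b,
                             new[] { 3, 8, 15 });

        Console.WriteLine(size); // 4
        Console.WriteLine(top);  // 7
        Console.WriteLine(last); // 15
    }
}
```

In each case the zero is the identity of the operation, which is what makes the measure of an empty tree well defined.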
Yet another challenge was to be able to create methods dynamically, as currying is used extensively in the specification of finger trees with measures. Once again it was great to make use of the existing .NET 3.5 infrastructure. Below is my simple FP static class, which uses the .NET 3.5 Func delegate and a lambda expression in order to implement currying:
public static class FP
{
    public static Func<Y, Z> Curry<X, Y, Z>
        (this Func<X, Y, Z> func, X x)
    {
        return (y) => func(x, y);
    }
}
And here is a typical usage of the currying implemented above:
public T ElemAt(uint ind)
{
    return treeRep.Split
        (new MPredicate<uint>
            (
                FP.Curry<uint, uint, bool>(theLessThanIMethod2, ind)
            ),
         0
        ).splitItem.Element;
}
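The same idea in a form that runs on its own is sketched below. CurryDemo and IndexLessThan are hypothetical names for this illustration; IndexLessThan is only a guess at the kind of two-argument predicate being fixed, not the actual theLessThanIMethod2. Strictly speaking, fixing one argument of a two-argument function is usually called partial application.

```csharp
using System;

public static class CurryDemo
{
    // Same shape as FP.Curry above: fixes the first argument of 'func'.
    public static Func<Y, Z> Curry<X, Y, Z>(this Func<X, Y, Z> func, X x)
    {
        return y => func(x, y);
    }

    // Hypothetical predicate: is 'index' less than the accumulated 'size'?
    public static bool IndexLessThan(uint index, uint size)
    {
        return index < size;
    }

    public static void Main()
    {
        Func<uint, uint, bool> lessThan = IndexLessThan;

        // Fix the index at 2; what remains is a one-argument predicate
        // over sizes, the shape a split operation can consume.
        Func<uint, bool> reached = lessThan.Curry(2u);

        Console.WriteLine(reached(2u)); // False
        Console.WriteLine(reached(3u)); // True
    }
}
```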
Now, for everyone who has reached this point of my post, here is the link to the complete implementation.
Be reminded once again that .NET 3.5 is needed for a successful build.
In my next posts I will analyze the performance of this Finger Tree implementation and how it fares compared to existing implementations of sequential data structures as provided by different programming languages and environments.