http://www.codeproject.com/Articles/298519/Fast-Token-Replacement-in-Csharp

Fast Token Replacement in C#

Introduction

FastReplacer is good for executing many Replace operations on a large string when performance is important.

The main idea is to avoid modifying existing text or allocating new memory every time a string is replaced.

We designed FastReplacer to help us on a project where we had to generate a large text with a large number of append and replace operations. The first version of the application took 20 seconds to generate the text using StringBuilder. The second, improved version that used the String class took 10 seconds. Then we implemented FastReplacer and the duration dropped to 0.1 seconds.

Using the code

Using FastReplacer should feel intuitive to developers who have used StringBuilder before.

Add the classes FastReplacer and FastReplacerSnippet to your C# console application and copy the following code into Program.cs:

using System;
using Omega.Alpha.Common;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Tokens will be delimited with { and }.
            FastReplacer fr = new FastReplacer("{", "}");
            fr.Append("{OPTION} OR {ANOTHER}.");
            fr.Replace("{ANOTHER}", "NOT {OPTION}");
            // The text is now "{OPTION} OR NOT {OPTION}."
            fr.Replace("{OPTION}", "TO BE");
            // The text is now "TO BE OR NOT TO BE."
            Console.WriteLine(fr.ToString());
        }
    }
}

Note that only properly formatted tokens can be replaced using FastReplacer. This constraint is set to ensure good performance since tokens can be extracted from text in one pass.

Case-insensitive

For a case-insensitive replace operation, set the additional constructor parameter caseSensitive to false (the default is true).

FastReplacer fr = new FastReplacer("{", "}", false); // Case-insensitive
fr.Append("{Token}, {token} and {TOKEN}.");
fr.Replace("{tOkEn}", "x"); // Text is "x, x and x."
Console.WriteLine(fr.ToString());

What is in the package

The algorithm is implemented in two C# classes on .NET Framework 4.0:

  • FastReplacer – Class for generating strings with a fast Replace function.
  • FastReplacerSnippet – Internal class that is used from FastReplacer. No need to use it directly.

The attached solution contains three projects for Visual Studio 2010:

  • Omega.Alpha.Common – A class library with the FastReplacer class and the FastReplacerSnippet class.
  • FastReplacerDemo – Demonstration console application. Contains samples and performance tests used in this article.
  • FastReplacerTest – Unit tests for FastReplacer. All cool features can be seen in these tests.

Performance

Speed

Using String or StringBuilder to replace tokens in large text takes more time because every time a Replace function is called, a new string is generated.

These tests were performed with FastReplacerDemo.exe (attached project) on a computer with Windows Experience Index: Processor 5.5, Memory 5.5.

ReplaceCount  TextLength  FastReplacer  String        StringBuilder
100           907         0.0008 sec    0.0006 sec    0.0014 sec
300           2707        0.0023 sec    0.0044 sec    0.0192 sec
1000          10008       0.0081 sec    0.0536 sec    1.2130 sec
3000          30008       0.0246 sec    0.4515 sec    43.5499 sec
10000         110009      0.0894 sec    5.9623 sec    1677.5883 sec
30000         330009      0.3649 sec    60.9739 sec   Skipped
100000        1200010     1.5461 sec    652.8718 sec  Skipped
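
To get a feel for the String and StringBuilder columns on your own machine, a baseline like the following can be used. It is only a sketch and not the FastReplacerDemo project from the attachment: the class ReplaceBaselines, the token names {T0}, {T1}, ... and the replacement values are made up for illustration, and timings on a current runtime may differ considerably from the table above.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical baseline, not the article's FastReplacerDemo.
static class ReplaceBaselines
{
    // Builds "{T0} {T1} ... " with `count` distinct tokens (made-up test input).
    public static string BuildTemplate(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append("{T").Append(i).Append("} ");
        return sb.ToString();
    }

    public static string WithString(string template, int count)
    {
        string text = template;
        for (int i = 0; i < count; i++)
            text = text.Replace("{T" + i + "}", "value" + i); // every call returns a new string
        return text;
    }

    public static string WithStringBuilder(string template, int count)
    {
        var sb = new StringBuilder(template);
        for (int i = 0; i < count; i++)
            sb.Replace("{T" + i + "}", "value" + i); // every call rescans the whole buffer
        return sb.ToString();
    }
}

class Program
{
    static void Main()
    {
        const int count = 1000;
        string template = ReplaceBaselines.BuildTemplate(count);

        var watch = Stopwatch.StartNew();
        string a = ReplaceBaselines.WithString(template, count);
        Console.WriteLine("String:        {0:F4} sec", watch.Elapsed.TotalSeconds);

        watch.Restart();
        string b = ReplaceBaselines.WithStringBuilder(template, count);
        Console.WriteLine("StringBuilder: {0:F4} sec", watch.Elapsed.TotalSeconds);

        Console.WriteLine("Same result: {0}", a == b);
    }
}
```

Increase count to see how quickly the cost grows when both the number of replacements and the text length increase together.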

Memory usage

Memory usage while working with FastReplacer is usually 3 times the memory size of the resulting text. This includes:

  1. Original strings sent to FastReplacer as arguments to the Append, Replace, InsertBefore, and InsertAfter functions.
  2. Temporary memory used while generating final text in the FastReplacer.ToString function.
  3. The final generated string.

The algorithm

Using the conventional string.Replace function or StringBuilder.Replace function to generate large text takes O(n*m) time, where n is the number of replace operations that is executed and m is the text length, because a new string is generated every time the function is executed.

This section explains why FastReplacer takes O(n*log(n) + m) time for the same task.
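
As a rough illustration of the gap between the two bounds, the sketch below plugs the ReplaceCount (n) and TextLength (m) values from the Speed table into both formulas. The class CostModel is hypothetical, and big-O expressions hide constant factors, so the numbers only indicate orders of magnitude, not predicted run times.

```csharp
using System;

// Hypothetical helper for comparing the two growth rates; not part of FastReplacer.
static class CostModel
{
    public static double Naive(long n, long m) { return (double)n * m; }           // n*m
    public static double Fast(long n, long m) { return n * Math.Log(n, 2) + m; }   // n*log2(n) + m
}

class Program
{
    static void Main()
    {
        // (ReplaceCount, TextLength) pairs taken from the Speed table.
        long[,] rows = { { 1000, 10008 }, { 10000, 110009 }, { 100000, 1200010 } };
        for (int i = 0; i < rows.GetLength(0); i++)
        {
            long n = rows[i, 0], m = rows[i, 1];
            double naive = CostModel.Naive(n, m);
            double fast = CostModel.Fast(n, m);
            Console.WriteLine("n={0,7} m={1,8}  n*m={2:E2}  n*log2(n)+m={3:E2}  ratio={4:F0}",
                n, m, naive, fast, naive / fast);
        }
    }
}
```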

Tokens

Tokens are recognized in text every time a new text is added (by the Append, Replace, InsertBefore, and InsertAfter functions). Positions of tokens are kept in a Dictionary for fast retrieval. That way searching for a text to be replaced takes O(n*log(n) + m) time instead of O(n*m). That is good, but the following part of the algorithm has more impact.
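
The idea of indexing tokens while text is added can be sketched as follows. This is not FastReplacer's actual code: the class TokenIndex and its members are hypothetical, it handles a single piece of text rather than many snippets, and it simply ignores an unterminated token instead of rejecting it. The caseSensitive flag is modeled here by choosing a string comparer for the dictionary, which is an assumption about one possible implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: records where each token occurs in one pass over the added text.
class TokenIndex
{
    private readonly string open, close;
    // token text -> start offsets at which the token occurs
    private readonly Dictionary<string, List<int>> positions;

    public TokenIndex(string open, string close, bool caseSensitive)
    {
        this.open = open;
        this.close = close;
        positions = new Dictionary<string, List<int>>(
            caseSensitive ? StringComparer.Ordinal : StringComparer.OrdinalIgnoreCase);
    }

    // Single left-to-right scan of the newly added text.
    public void Add(string text)
    {
        int pos = 0;
        while (true)
        {
            int start = text.IndexOf(open, pos, StringComparison.Ordinal);
            if (start < 0) break;
            int end = text.IndexOf(close, start + open.Length, StringComparison.Ordinal);
            if (end < 0) break; // unterminated token: ignored in this sketch
            pos = end + close.Length;
            string token = text.Substring(start, pos - start);
            List<int> list;
            if (!positions.TryGetValue(token, out list))
                positions[token] = list = new List<int>();
            list.Add(start);
        }
    }

    // Looks up a token without rescanning the text.
    public List<int> Find(string token)
    {
        List<int> list;
        return positions.TryGetValue(token, out list) ? list : new List<int>();
    }
}

class Program
{
    static void Main()
    {
        var index = new TokenIndex("{", "}", false);
        index.Add("{Token}, {token} and {TOKEN}.");
        Console.WriteLine(string.Join(", ", index.Find("{tOkEn}"))); // 0, 9, 21
    }
}
```

Because the positions are collected when the text is added, a later replace only needs a dictionary lookup instead of a scan over the whole text.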

Snippets

When the Replace function is called to replace a token with a new string, FastReplacer does not modify existing text. Instead, it keeps track of the position in the original text where the token should be cut out and the new string inserted. The next time the Replace function is called, it does the same on the original text and in the new strings that were previously inserted. That creates a directed acyclic graph where every node (called FastReplacerSnippet) represents a string with information about where it should be inserted in its parent text, and with an array of child nodes that should be inserted in that string.

Every replace operation takes O(log n) time to find the matching token (covered in the previous section) and O(1) to insert a new node in the data structure.

Generating the final string takes O(m) time because there is only one pass through the data structure to recursively collect all the parts that need to be concatenated.
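
The snippet structure and the single collecting pass can be sketched like this. The Snippet class below is a hypothetical stand-in for FastReplacerSnippet, whose real fields and methods are not shown in this article. Token positions are passed in by hand, and the children are sorted during output to keep the sketch short, which a real implementation would likely avoid.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for FastReplacerSnippet; names and layout are assumptions.
class Snippet
{
    public class Child
    {
        public int Start, End;   // range [Start, End) of the parent text that is cut out
        public Snippet Node;     // snippet inserted in place of that range
    }

    public readonly string Text;
    public readonly List<Child> Children = new List<Child>();

    public Snippet(string text) { Text = text; }

    // Records the cut and the inserted snippet; Text itself is never modified.
    public Snippet Replace(int start, int end, string newText)
    {
        var node = new Snippet(newText);
        Children.Add(new Child { Start = start, End = end, Node = node });
        return node;
    }

    // One recursive pass that appends every surviving piece of text exactly once.
    public void WriteTo(StringBuilder sb)
    {
        Children.Sort((a, b) => a.Start.CompareTo(b.Start));
        int pos = 0;
        foreach (var c in Children)
        {
            sb.Append(Text, pos, c.Start - pos); // text before the cut
            c.Node.WriteTo(sb);                  // inserted snippet and its children
            pos = c.End;                         // skip the replaced token
        }
        sb.Append(Text, pos, Text.Length - pos); // remaining tail
    }
}

class Program
{
    static void Main()
    {
        var root = new Snippet("{OPTION} OR {ANOTHER}.");
        // Replace {ANOTHER} (characters 12..20) with "NOT {OPTION}".
        var inserted = root.Replace(12, 21, "NOT {OPTION}");
        // Replace {OPTION} in the root (0..7) and in the inserted snippet (4..11).
        root.Replace(0, 8, "TO BE");
        inserted.Replace(4, 12, "TO BE");

        var sb = new StringBuilder();
        root.WriteTo(sb);
        Console.WriteLine(sb.ToString()); // TO BE OR NOT TO BE.
    }
}
```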

Sample 1

For example, in the string “My {pet} likes {food}.”, if token “{pet}” is replaced with “tortoise”, the following data structure will be created:

The final text will be composed by concatenating the text parts “My ”, “tortoise”, and “ likes {food}.”.

Sample 2

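A more complex example is the text “{OPTION} OR {ANOTHER}”. If the token “{ANOTHER}” is replaced with “NOT {OPTION}”, then the token “{OPTION}” replaced with “TO BE”, we will get a nested structure: the original text has two child snippets, “TO BE” in place of “{OPTION}” and “NOT {OPTION}” in place of “{ANOTHER}”, and the snippet “NOT {OPTION}” in turn has its own child “TO BE” in place of its “{OPTION}”.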

Constraints

When snippets of text are inserted, tokens are searched in every snippet separately. Tokens in every snippet must be properly formatted. A token cannot start in one snippet then end in another.

For example, you cannot insert a text that contains only the beginning of a token (e.g., “Hello {”) then append a text with the end of the token (e.g., “USERNAME}.”). Each of these function calls would fail because the token in each text is not properly formatted.

To ensure maximal consistency, FastReplacer will throw an exception if the inserted text contains an improperly formatted token.
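
A check of this kind could look like the sketch below. It is an illustration only: the class TokenValidator is hypothetical, the article does not say which exception type FastReplacer throws (ArgumentException is assumed here), and the sketch assumes that the start and end delimiters are different strings.

```csharp
using System;

// Hypothetical validator, not FastReplacer's code; the exception type is an assumption.
static class TokenValidator
{
    // Throws if a token starts without ending, or ends without starting, inside `text`.
    public static void Validate(string text, string open, string close)
    {
        bool inside = false;
        int pos = 0;
        while (pos < text.Length)
        {
            if (string.CompareOrdinal(text, pos, open, 0, open.Length) == 0)
            {
                if (inside) throw new ArgumentException("Token start inside a token at " + pos + ".");
                inside = true;
                pos += open.Length;
            }
            else if (string.CompareOrdinal(text, pos, close, 0, close.Length) == 0)
            {
                if (!inside) throw new ArgumentException("Token end without start at " + pos + ".");
                inside = false;
                pos += close.Length;
            }
            else
            {
                pos++;
            }
        }
        if (inside) throw new ArgumentException("Token is not closed.");
    }
}

class Program
{
    static void Main()
    {
        TokenValidator.Validate("Hello {USERNAME}.", "{", "}"); // well formed, no exception

        foreach (string bad in new[] { "Hello {", "USERNAME}." })
        {
            try
            {
                TokenValidator.Validate(bad, "{", "}");
            }
            catch (ArgumentException e)
            {
                Console.WriteLine("Rejected \"" + bad + "\": " + e.Message);
            }
        }
    }
}
```

Validating each piece of text on its own is what allows tokens to be located independently in every snippet.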
