https://github.com/couchbaselabs/vellum

Building an FST

To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.

In memory:

  var buf bytes.Buffer
builder, err := vellum.New(&buf, nil)
if err != nil {
log.Fatal(err)
}

To disk:

  f, err := os.Create("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}
builder, err := vellum.New(f, nil)
if err != nil {
log.Fatal(err)
}

MUST insert keys in lexicographic order:

err = builder.Insert([]byte("cat"), 1)
if err != nil {
log.Fatal(err)
} err = builder.Insert([]byte("dog"), 2)
if err != nil {
log.Fatal(err)
} err = builder.Insert([]byte("fish"), 3)
if err != nil {
log.Fatal(err)
} err = builder.Close()
if err != nil {
log.Fatal(err)
}

Using an FST

After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open()method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.

Load in memory:

  fst, err := vellum.Load(buf.Bytes())
if err != nil {
log.Fatal(err)
}

Open from disk:

  fst, err := vellum.Open("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}

Get key/value:

  val, exists, err = fst.Get([]byte("dog"))
if err != nil {
log.Fatal(err)
}
if exists {
fmt.Printf("contains dog with val: %d\n", val)
} else {
fmt.Printf("does not contain dog")
}

Iterate key/values:

  itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
for err == nil {
key, val := itr.Current()
fmt.Printf("contains key: %s val: %d", key, val)
err = itr.Next()
}
if err != nil {
log.Fatal(err)
}

How does the FST get built?

A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.

First we insert "are" with the value 4.

Next, we insert "ate" with the value 2.

Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.

At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.

Now, we insert "see" with value 3. Once it has been added, we now know that states 5 and 4 can longer change. Since they are identical to 3 and 2, we replace them.

Again, we see that states 7 and 8 appear to be identical to 2 and 3.

Having inserted our last key, we call Close() on the builder.

Now, states 7 and 8 can safely be replaced with 2 and 3.

For additional information, see the references at the bottom of this document.

A Go library implementing an FST (finite state transducer)——mark下的更多相关文章

  1. Finite State Transducers

    一, 简介 Finite State Transducers 简称 FST, 中文名:有穷状态转换器.在自然语言处理等领域有很大应用,其功能类似于字典的功能(STL 中的map,C# 中的Dictio ...

  2. Finite State Machine 是什么?

    状态机(Finite State Machine):状态机由状态寄存器和组合逻辑电路构成,能够根据控制信号按照预先设定的状态进行状态转移,是协调相关信号动       作.完成特定操作的控制中心. 类 ...

  3. Finite State Machine

    Contents [hide]  1 Description 2 Components 3 C# - FSMSystem.cs 4 Example Description This is a Dete ...

  4. 证明与计算(7): 有限状态机(Finite State Machine)

    什么是有限状态机(Finite State Machine)? 什么是确定性有限状态机(deterministic finite automaton, DFA )? 什么是非确定性有限状态机(nond ...

  5. paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(二段式)

    1.Two always block style with combinational outputs(Good Style) 对应的代码如下: 2段式总结: (1)the combinational ...

  6. paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 FSM Coding Goals

    1.the fsm coding style should be easily modifiable to change state encoding and FSM styles. FSM 的的 状 ...

  7. FPGA学习笔记(七)——FSM(Finite State Machine,有限状态机)设计

    FPGA设计中,最重要的设计思想就是状态机的设计思想!状态机的本质就是对具有逻辑顺序和时序规律的事件的一种描述方法,它有三个要素:状态.输入.输出:状态也叫做状态变量(比如可以用电机的不同转速作为状态 ...

  8. paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(三段式)

    Three always block style with registered outputs(Good style)

  9. TCP Operational Overview and the TCP Finite State Machine (FSM) http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm

    http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm   http://tcpipgu ...

随机推荐

  1. VIM命令大全(图+文)

    在命令状态下对当前行用== (连按=两次), 或对多行用n==(n是自然数)表示自动缩进从当前行起的下面n行.你可以试试把代码缩进任意打乱再用n==排版,相当于一般IDE里的code format.使 ...

  2. Servlet中的几个重要的对象(转)

    讲解四大类,ServletConfig对象,ServletContext对象.request对象,response对象 ServletConfig对象 获取途径:getServletConfig(); ...

  3. [Python3网络爬虫开发实战] 7.3-Splash负载均衡配置

    用Splash做页面抓取时,如果爬取的量非常大,任务非常多,用一个Splash服务来处理的话,未免压力太大了,此时可以考虑搭建一个负载均衡器来把压力分散到各个服务器上.这相当于多台机器多个服务共同参与 ...

  4. LeetCode(70) Climbing Stairs

    题目 You are climbing a stair case. It takes n steps to reach to the top. Each time you can either cli ...

  5. jquery源码——noConflict实现

    实现方式很简单:在初始化的时候,记录当前全局中jQuery和$两个变量的的值,用_jQuery和_$分别存放,调用noConflict方法时,使用_jQuery和_$分别恢复对应的值,并且返回jQue ...

  6. Lucene的分词_中文分词器介绍

    Paoding:庖丁解牛分词器.已经没有更新了. MMSeg:搜狗的词库. MMSeg分词器的一些截图: 步骤: 1.导入包 2.创建的时候使用MMSegAnalyzer分词器

  7. 转盘抽奖 canvas & 抽奖 H5 源码

    转盘抽奖 canvas https://github.com/givebest/wechat-turntalbe-canvas https://blog.givebest.cn/GB-canvas-t ...

  8. 调用系统文件管理器选择图片,调用系统裁剪AIP对图片处理,显示裁剪之后的图片

    package com.pingyijinren.test; import android.annotation.TargetApi; import android.app.Notification; ...

  9. Delphi简单的数据操作类

    unit MyClass; uses   Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,   VCL ...

  10. hdu4701 Game(递推博弈)

    题意: Alice初始有A元,Bob有B元. 有N个物品,第i个物品价值为Ci.Alice和Bob轮流买一些(>=1)物品.不能移动的人输.购买有一个限制,对于第1 个之后物品,只有当第i-1个 ...