A Go library implementing an FST (finite state transducer)——mark下
https://github.com/couchbaselabs/vellum
Building an FST
To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.
In memory:
var buf bytes.Buffer
builder, err := vellum.New(&buf, nil)
if err != nil {
log.Fatal(err)
}
To disk:
f, err := os.Create("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}
builder, err := vellum.New(f, nil)
if err != nil {
log.Fatal(err)
}
MUST insert keys in lexicographic order:
err = builder.Insert([]byte("cat"), 1)
if err != nil {
log.Fatal(err)
}
err = builder.Insert([]byte("dog"), 2)
if err != nil {
log.Fatal(err)
}
err = builder.Insert([]byte("fish"), 3)
if err != nil {
log.Fatal(err)
}
err = builder.Close()
if err != nil {
log.Fatal(err)
}
Using an FST
After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open()method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.
Load in memory:
fst, err := vellum.Load(buf.Bytes())
if err != nil {
log.Fatal(err)
}
Open from disk:
fst, err := vellum.Open("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}
Get key/value:
val, exists, err = fst.Get([]byte("dog"))
if err != nil {
log.Fatal(err)
}
if exists {
fmt.Printf("contains dog with val: %d\n", val)
} else {
fmt.Printf("does not contain dog")
}
Iterate key/values:
itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
for err == nil {
key, val := itr.Current()
fmt.Printf("contains key: %s val: %d", key, val)
err = itr.Next()
}
if err != nil {
log.Fatal(err)
}
How does the FST get built?
A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.
First we insert "are" with the value 4.

Next, we insert "ate" with the value 2.

Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.
At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.
Now, we insert "see" with value 3. Once it has been added, we now know that states 5 and 4 can longer change. Since they are identical to 3 and 2, we replace them.

Again, we see that states 7 and 8 appear to be identical to 2 and 3.
Having inserted our last key, we call Close() on the builder.

Now, states 7 and 8 can safely be replaced with 2 and 3.
For additional information, see the references at the bottom of this document.
A Go library implementing an FST (finite state transducer)——mark下的更多相关文章
- Finite State Transducers
一, 简介 Finite State Transducers 简称 FST, 中文名:有穷状态转换器.在自然语言处理等领域有很大应用,其功能类似于字典的功能(STL 中的map,C# 中的Dictio ...
- Finite State Machine 是什么?
状态机(Finite State Machine):状态机由状态寄存器和组合逻辑电路构成,能够根据控制信号按照预先设定的状态进行状态转移,是协调相关信号动 作.完成特定操作的控制中心. 类 ...
- Finite State Machine
Contents [hide] 1 Description 2 Components 3 C# - FSMSystem.cs 4 Example Description This is a Dete ...
- 证明与计算(7): 有限状态机(Finite State Machine)
什么是有限状态机(Finite State Machine)? 什么是确定性有限状态机(deterministic finite automaton, DFA )? 什么是非确定性有限状态机(nond ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(二段式)
1.Two always block style with combinational outputs(Good Style) 对应的代码如下: 2段式总结: (1)the combinational ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 FSM Coding Goals
1.the fsm coding style should be easily modifiable to change state encoding and FSM styles. FSM 的的 状 ...
- FPGA学习笔记(七)——FSM(Finite State Machine,有限状态机)设计
FPGA设计中,最重要的设计思想就是状态机的设计思想!状态机的本质就是对具有逻辑顺序和时序规律的事件的一种描述方法,它有三个要素:状态.输入.输出:状态也叫做状态变量(比如可以用电机的不同转速作为状态 ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(三段式)
Three always block style with registered outputs(Good style)
- TCP Operational Overview and the TCP Finite State Machine (FSM) http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm
http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm http://tcpipgu ...
随机推荐
- 去掉idea中的警告
目前我使用的两种方法 1.idea右下角有个小人,单击后选择Syntax即可 2.在setting→Editor→Inspections搜索SQL,把No data sources configure ...
- 零基础入门学习Python(5)--闲聊之Python的数据类型
前言 本次主要闲聊一下python的一些数值类型,整型(int),浮点型(float),布尔类型(bool),还有e记法(科学计数法),也是属于浮点型. 数值类型介绍 整型 整型就是我们平时所说的整数 ...
- hadoop_exporter python版本的安装使用
1.需要使用python pip 参考https://www.cnblogs.com/rain124/p/6196053.html python2.7.5 安装pip 1 先安装setuptools ...
- MT6755 平台手机皮套驱动实现
是自己写注册一个input device,模仿keypad,在对应的中断处理函数中上报power key的键值. 具体实现代码如下: 在 alps/kernel-3.10/drivers/misc/m ...
- C++字符串读入
int read() { ,f=;char ch=getchar(); ;ch=getchar();} +ch-';ch=getchar();} return x*f; } int main() { ...
- Spring3.2+mybatis3.2+Struts2.3整合配置文件大全
0.配置文件目录 1.Spring配置 applicationContext-dao.xml <?xml version="1.0" encoding="UTF-8 ...
- vagrant的学习 之 LNMP和LAMP
vagrant的学习 之 LNMP和LAMP 本文根据慕课网的视频教程练习,感谢慕课网! 慕课的参考文档地址:https://github.com/apanly/mooc/tree/master/va ...
- SOJ 3300_Stockholm Coins
[题意]给n个数,求一个数,使这个数能且只能由(n个数每个至少出现一次)表示.输出满足条件的最小的数. [分析](完全背包)如果有满足条件的最小的数,那么这个数只能是这n个数的和total,通过记录每 ...
- POJ 3468_A Simple Problem with Integers(树状数组)
完全不知道该怎么用,看书稍微懂了点. 题意: 给定序列及操作,求区间和. 分析: 树状数组可以高效的求出连续一段元素之和或更新单个元素的值.但是无法高效的给某一个区间的所有元素同时加个值. 不能直接用 ...
- java服务器图片压缩的几种方式及效率比较
以下是测试了三种图片压缩方式,通过测试发现使用jdk的ImageIO压缩时间更短,使用Google的thumbnailator更简单,但是thumbnailator在GitHub上的源码已经停止维护了 ...