A Go library implementing an FST (finite state transducer)——mark下
https://github.com/couchbaselabs/vellum
Building an FST
To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.
In memory:
var buf bytes.Buffer
builder, err := vellum.New(&buf, nil)
if err != nil {
log.Fatal(err)
}
To disk:
f, err := os.Create("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}
builder, err := vellum.New(f, nil)
if err != nil {
log.Fatal(err)
}
MUST insert keys in lexicographic order:
err = builder.Insert([]byte("cat"), 1)
if err != nil {
log.Fatal(err)
}
err = builder.Insert([]byte("dog"), 2)
if err != nil {
log.Fatal(err)
}
err = builder.Insert([]byte("fish"), 3)
if err != nil {
log.Fatal(err)
}
err = builder.Close()
if err != nil {
log.Fatal(err)
}
Using an FST
After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open()method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.
Load in memory:
fst, err := vellum.Load(buf.Bytes())
if err != nil {
log.Fatal(err)
}
Open from disk:
fst, err := vellum.Open("/tmp/vellum.fst")
if err != nil {
log.Fatal(err)
}
Get key/value:
val, exists, err = fst.Get([]byte("dog"))
if err != nil {
log.Fatal(err)
}
if exists {
fmt.Printf("contains dog with val: %d\n", val)
} else {
fmt.Printf("does not contain dog")
}
Iterate key/values:
itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
for err == nil {
key, val := itr.Current()
fmt.Printf("contains key: %s val: %d", key, val)
err = itr.Next()
}
if err != nil {
log.Fatal(err)
}
How does the FST get built?
A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.
First we insert "are" with the value 4.

Next, we insert "ate" with the value 2.

Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.
At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.
Now, we insert "see" with value 3. Once it has been added, we now know that states 5 and 4 can longer change. Since they are identical to 3 and 2, we replace them.

Again, we see that states 7 and 8 appear to be identical to 2 and 3.
Having inserted our last key, we call Close() on the builder.

Now, states 7 and 8 can safely be replaced with 2 and 3.
For additional information, see the references at the bottom of this document.
A Go library implementing an FST (finite state transducer)——mark下的更多相关文章
- Finite State Transducers
一, 简介 Finite State Transducers 简称 FST, 中文名:有穷状态转换器.在自然语言处理等领域有很大应用,其功能类似于字典的功能(STL 中的map,C# 中的Dictio ...
- Finite State Machine 是什么?
状态机(Finite State Machine):状态机由状态寄存器和组合逻辑电路构成,能够根据控制信号按照预先设定的状态进行状态转移,是协调相关信号动 作.完成特定操作的控制中心. 类 ...
- Finite State Machine
Contents [hide] 1 Description 2 Components 3 C# - FSMSystem.cs 4 Example Description This is a Dete ...
- 证明与计算(7): 有限状态机(Finite State Machine)
什么是有限状态机(Finite State Machine)? 什么是确定性有限状态机(deterministic finite automaton, DFA )? 什么是非确定性有限状态机(nond ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(二段式)
1.Two always block style with combinational outputs(Good Style) 对应的代码如下: 2段式总结: (1)the combinational ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 FSM Coding Goals
1.the fsm coding style should be easily modifiable to change state encoding and FSM styles. FSM 的的 状 ...
- FPGA学习笔记(七)——FSM(Finite State Machine,有限状态机)设计
FPGA设计中,最重要的设计思想就是状态机的设计思想!状态机的本质就是对具有逻辑顺序和时序规律的事件的一种描述方法,它有三个要素:状态.输入.输出:状态也叫做状态变量(比如可以用电机的不同转速作为状态 ...
- paper:synthesizable finite state machine design techniques using the new systemverilog 3.0 enhancements 之 standard verilog FSM conding styles(三段式)
Three always block style with registered outputs(Good style)
- TCP Operational Overview and the TCP Finite State Machine (FSM) http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm
http://tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF.htm http://tcpipgu ...
随机推荐
- mysql高效率随机获取n条数据写法
今天做项目遇到这个问题,本来想用mysql自带的随机函数来实现,但是想到这样做功能是实现了,但是效率真的好差!一下子想不到好的方法,就去网上找了一下,记录下来,好好研究学习一下. ID连续的情况下(注 ...
- 零基础入门学习Python(14)--字符串:各种奇葩的内置方法
前言 这节课我们回过头来,再谈一下字符串,或许我们现在再来谈字符串,有些朋友可能觉得没必要了,甚至有些朋友就会觉得,不就是字符串吗,哥闭着眼也能写出来,那其实关于字符串还有很多你不知道的秘密哦.由于字 ...
- 转:Centos7安装zabbix3.4超详细步骤解析
安装前准备: 1.1 安装依赖包: yum -y install wget net-snmp-devel OpenIPMI-devel httpd openssl-devel java lrzsz f ...
- C++动态申请内存 new T()与new T[]的区别
new与delete 我们知道,new和delete运算符是用于动态分配和撤销内存的运算符. new的用法 开辟单变量地址空间: i. 如 new int ; 指开辟一个存放数组的存储空间,返回一个指 ...
- TCP传输的三次握手四次挥手策略
为了准确无误地数据送达目标处,TCP协议采用了三次握手策略.用TCP协议把数据包送出去后,TCP不会对传送后的情况置之不理,它一定会向对方确认是否成功送达.握手中使用了TCP的标志:SYN和ACK 发 ...
- PS注意点
2.颜色 设计师应该具备审美能力. 3.实验 不断的练习会让你学习到更多的东西,请不要给自己太多压力,你的付出不会仅仅只让你原地踏步,要坚持. 填充和不透明的掌握. 还有流量的使用. 填充是一 ...
- SQLAlchemy(2):多表操作 & 连接方式及原生SQL
一对多:ForeignKey multitb_models.py import datetime from sqlalchemy import create_engine # 引入 创建引擎 from ...
- 2018/2/14 x-pack的学习
x-pack是什么?它能提供的作用如下,下面描述的这些功能都属于x-park:Shield: 提供对数据的 Password-Protect,以及加密通信.基于角色的权限控制,IP 过滤,审计,可以有 ...
- Linux下汇编语言学习笔记42 ---
这是17年暑假学习Linux汇编语言的笔记记录,参考书目为清华大学出版社 Jeff Duntemann著 梁晓辉译<汇编语言基于Linux环境>的书,喜欢看原版书的同学可以看<Ass ...
- Linux下汇编语言学习笔记20 ---
这是17年暑假学习Linux汇编语言的笔记记录,参考书目为清华大学出版社 Jeff Duntemann著 梁晓辉译<汇编语言基于Linux环境>的书,喜欢看原版书的同学可以看<Ass ...