1. Word segmentation

In China, jieba is a widely used library for Chinese word segmentation.
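As a minimal sketch of the segmentation step, the helper below wraps jieba's `lcut` function, which returns a list of tokens. The name `segment` and the optional `cut` parameter are illustrative assumptions rather than part of the original post; `cut` exists only so that another segmenter can be swapped in.

```python
from typing import Callable, List, Optional


def segment(text: str, cut: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Split a sentence into tokens and drop whitespace-only tokens.

    `cut` is a hypothetical hook for plugging in another segmenter.
    When it is omitted, jieba.lcut is used (third-party: pip install jieba).
    """
    if cut is None:
        import jieba  # imported lazily so the helper loads without jieba installed

        cut = jieba.lcut
    return [token for token in cut(text) if token.strip()]
```

With jieba installed, `segment("我今天晚上吃了两盘回锅肉")` returns the sentence as a list of words. The exact split depends on jieba's dictionary, so food names that jieba does not know may need to be added to a user dictionary.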

2. Named Entity Recognition

Training your own model

How can I train my own NER model?

https://nlp.stanford.edu/software/crf-faq.html#a
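The training run below reads its data from chinese.meal.fpp.tsv. A minimal sketch of how such a file could be produced from segmented, hand-labeled sentences follows. It assumes one token per line with the word in the first column and the label in the second, separated by a tab, and a blank line between sentences. The column order is an assumption because the `map` values are not legible in the log, and `to_tsv` and `example` are illustrative names.

```python
from typing import Iterable, List, Tuple

# One sentence is a list of (word, label) pairs.
Sentence = List[Tuple[str, str]]


def to_tsv(sentences: Iterable[Sentence]) -> str:
    """Render tagged sentences as word<TAB>label lines.

    Sentences are separated by one blank line. The word-first column
    order is an assumption about the `map` property.
    """
    blocks = []
    for sentence in sentences:
        blocks.append("\n".join(f"{word}\t{label}" for word, label in sentence))
    return "\n\n".join(blocks) + "\n"


# The labels are the ones that appear in this post: O, TIME, QUANTITY, UNIT, FOOD.
example = [[
    ("我", "O"), ("今天", "O"), ("晚上", "TIME"), ("吃", "O"),
    ("了", "O"), ("两", "QUANTITY"), ("盘", "UNIT"), ("回锅肉", "FOOD"),
]]
tsv_text = to_tsv(example)
```

Writing `tsv_text` to a file with `encoding="utf-8"` gives a file in the shape described above. UTF-8 is assumed here; if the tool is configured for another input encoding, the file should match it.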

C:\my_study\ML\NLP\stanford-ner--->java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop chinese.meal.fpp.prop

Invoked on Thu Mar  :: CST  with arguments: -prop chinese.meal.fpp.prop

usePrevSequences=true

useClassFeature=true

useTypeSeqs2=true

useSequences=true

wordShape=chris2useLC

useTypeySequences=true

useDisjunctive=true

noMidNGrams=true

serializeTo=ner-model.ser.gz

maxNGramLeng=

useNGrams=true

usePrev=true

useNext=true

maxLeft=

trainFile=chinese.meal.fpp.tsv

map=word=,answer=

useWord=true

useTypeSeqs=true

numFeatures =

Time to convert docs to feature indices: 0.0 seconds

numClasses:  [=O,=TIME,=QUANTITY,=UNIT,=FOOD]

numDocuments:

numDatums:

numFeatures:

Time to convert docs to data/labels: 0.0 seconds

numWeights:

QNMinimizer called on double function of  variables, using M = .

               An explanation of the output:

Iter           The number of iterations

evals          The number of function evaluations

SCALING        <D> Diagonal scaling was used; <I> Scaled Identity

LINESEARCH     [## M steplength]  Minpack linesearch

                   -Function value was too high

                   -Value ok, gradient positive, positive curvature

                   -Value ok, gradient negative, positive curvature

                   -Value ok, gradient negative, negative curvature

               [.. B]  Backtracking

VALUE          The current function value

TIME           Total elapsed time

|GNORM|        The current norm of the gradient

{RELNORM}      The ratio of the current to initial gradient norms

AVEIMPROVE     The average improvement / current value

EVALSCORE      The last available eval score

Iter ## evals ## <SCALING> [LINESEARCH] VALUE TIME |GNORM| {RELNORM} AVEIMPROVE EVALSCORE

Iter  evals  <D> [M 1.000E-1] 9.068E2 .04s |4.550E1| {4.995E-1} 0.000E0 -

Iter  evals  <D> [M 1.000E0] 6.222E2 .05s |3.525E1| {3.870E-1} 2.287E-1 -

Iter  evals  <D> [M 1.000E0] 2.386E2 .07s |5.406E1| {5.935E-1} 9.334E-1 -

Iter  evals  <D> [M 1.000E0] 9.082E1 .08s |1.571E1| {1.724E-1} 2.246E0 -

Iter  evals  <D> [M 1.000E0] 7.031E1 .10s |1.181E1| {1.297E-1} 2.379E0 -

Iter  evals  <D> [M 1.000E0] 5.308E1 .11s |1.025E1| {1.125E-1} 2.681E0 -

Iter  evals  <D> [1M 2.740E-1] 2.988E1 .14s |7.586E0| {8.328E-2} 4.193E0 -

Iter  evals  <D> [1M 1.292E-1] 2.234E1 .16s |6.471E0| {7.105E-2} 4.949E0 -

Iter  evals  <D> [1M 1.801E-1] 1.615E1 .18s |5.573E0| {6.118E-2} 6.127E0 -

Iter  evals  <D> [1M 1.815E-1] 1.218E1 .24s |4.477E0| {4.915E-2} 7.346E0 -

Iter  evals  <D> [1M 3.119E-1] 8.873E0 .30s |4.694E0| {5.154E-2} 6.912E0 -

Iter  evals  <D> [1M 4.760E-1] 6.621E0 .31s |2.092E0| {2.296E-2} 3.504E0 -

Iter  evals  <D> [M 1.000E0] 6.093E0 .32s |1.906E0| {2.092E-2} 1.390E0 -

Iter  evals  <D> [M 1.000E0] 5.844E0 .33s |9.067E-1| {9.955E-3} 1.103E0 -

Iter  evals  <D> [M 1.000E0] 5.721E0 .33s |5.774E-1| {6.339E-3} 8.279E-1 -

Iter  evals  <D> [M 1.000E0] 5.660E0 .34s |3.535E-1| {3.881E-3} 4.279E-1 -

Iter  evals  <D> [M 1.000E0] 5.640E0 .35s |1.946E-1| {2.137E-3} 2.961E-1 -

Iter  evals  <D> [M 1.000E0] 5.632E0 .36s |7.832E-2| {8.599E-4} 1.868E-1 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .38s |3.559E-2| {3.907E-4} 1.163E-1 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .39s |2.149E-2| {2.359E-4} 5.758E-2 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .41s |1.027E-2| {1.128E-4} 1.758E-2 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .42s |3.631E-3| {3.986E-5} 8.218E-3 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .44s |1.629E-3| {1.789E-5} 3.791E-3 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .45s |9.548E-4| {1.048E-5} 1.596E-3 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .45s |5.724E-4| {6.284E-6} 5.196E-4 -

Iter  evals  <D> [M 1.000E0] 5.631E0 .47s |1.578E-4| {1.732E-6} 1.686E-4 -

QNMinimizer terminated due to average improvement: | newest_val - previous_val | / |newestVal| < TOL

Total time spent in optimization: .49s

CRFClassifier training ... done [0.6 sec].

Serializing classifier to ner-model.ser.gz... done.
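The log ends with the message "| newest_val - previous_val | / |newestVal| < TOL". A rough sketch of that stopping test is shown below, using only the last two function values. This is a simplification: the log describes an average improvement, so the real optimizer presumably looks at more than two iterations. The tolerance value is an assumption chosen for illustration, and the function names are hypothetical.

```python
from typing import Sequence


def relative_improvement(newest_val: float, previous_val: float) -> float:
    """Return |newest_val - previous_val| / |newest_val|.

    Undefined when newest_val is zero; callers should guard against that.
    """
    return abs(newest_val - previous_val) / abs(newest_val)


def has_converged(values: Sequence[float], tol: float = 1e-4) -> bool:
    """Check the last two objective values against the tolerance.

    `tol` is an illustrative default, not a value read from the log.
    """
    if len(values) < 2:
        return False
    return relative_improvement(values[-1], values[-2]) < tol


# VALUE column entries copied from the log above.
early = [5.844, 5.721]
late = [5.631, 5.631]
```

With these inputs, `has_converged(early)` is false because the value still changes by about 2%, while `has_converged(late)` is true because the printed value no longer changes.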

2. Use the trained model to run an evaluation and see how well it performs.

C:\my_study\ML\NLP\stanford-ner--->java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier ner-model.ser.gz -testFile chinese.meal.fpp.test.tsv

Invoked on Thu Mar  :: CST  with arguments: -loadClassifier ner-model.ser.gz -testFile chinese.meal.fpp.test.tsv

testFile=chinese.meal.fpp.test.tsv

loadClassifier=ner-model.ser.gz

Loading classifier from ner-model.ser.gz ... done [0.1 sec].

我      O       O

今天    O       O

晚上    TIME    TIME

吃      O       O

了      O       O

两      QUANTITY        QUANTITY

盘      UNIT    UNIT

回锅肉  FOOD    FOOD

CRFClassifier tagged  words in  documents at 88.89 words per second.

         Entity P       R       F1      TP      FP      FN

           FOOD 1.0000  1.0000  1.0000

       QUANTITY 1.0000  1.0000  1.0000

           TIME 1.0000  1.0000  1.0000

           UNIT 1.0000  1.0000  1.0000

         Totals 1.0000  1.0000  1.0000

Not bad!

Ref:

1. Stanford NLP NER: https://nlp.stanford.edu/software/CRF-NER.html
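To make the table above easier to interpret, here is a minimal sketch that recomputes precision, recall and F1 from the three-column output (word, gold label, predicted label). It counts individual tokens, which is a simplification: the tool reports scores per entity, so results can differ when an entity spans several tokens. In this test sentence every entity is a single token, so the two views agree. The name `token_level_scores` is illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def token_level_scores(
    rows: Iterable[Tuple[str, str, str]], background: str = "O"
) -> Dict[str, Tuple[float, float, float]]:
    """Compute (precision, recall, F1) per label from (word, gold, predicted) rows."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for _word, gold, predicted in rows:
        if gold == predicted:
            if gold != background:
                tp[gold] += 1
        else:
            if predicted != background:
                fp[predicted] += 1  # predicted a label that is not the gold one
            if gold != background:
                fn[gold] += 1  # missed the gold label
    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Rows copied from the evaluation output above.
rows = [
    ("我", "O", "O"), ("今天", "O", "O"), ("晚上", "TIME", "TIME"),
    ("吃", "O", "O"), ("了", "O", "O"), ("两", "QUANTITY", "QUANTITY"),
    ("盘", "UNIT", "UNIT"), ("回锅肉", "FOOD", "FOOD"),
]
scores = token_level_scores(rows)
```

For these eight rows, every label gets precision, recall and F1 of 1.0, matching the table. A single test sentence is a very small sample, so a perfect score here says little about how the model handles unseen meals.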
