A part of Natural Language Processing (NLP) is processing text by “tokenizing” language strings. This means we can break up a string of text into parts by word, sentence, etc. In this lesson, we will use the natural library to tokenize a string. First, we will break the string into words using WordTokenizerWordPunctTokenizer, and TreebankWordTokenizer. Then we will break the string into sentences using RegexpTokenizer.

var natural = require('natural'),
tokenizer = new natural.WordTokenizer();
console.log(tokenizer.tokenize("your dog has fleas."));
// [ 'your', 'dog', 'has', 'fleas' ]
tokenizer = new natural.TreebankWordTokenizer();
console.log(tokenizer.tokenize("my dog hasn't any fleas."));
// [ 'my', 'dog', 'has', 'n\'t', 'any', 'fleas', '.' ] tokenizer = new natural.RegexpTokenizer({pattern: /\-/});
console.log(tokenizer.tokenize("flea-dog"));
// [ 'flea', 'dog' ] tokenizer = new natural.WordPunctTokenizer();
console.log(tokenizer.tokenize("my dog hasn't any fleas."));
// [ 'my', 'dog', 'hasn', '\'', 't', 'any', 'fleas', '.' ]

[Javascript Natural] Break up language strings into parts using Natural的更多相关文章

  1. (四)JavaScript之[break和continue]与[typeof、null、undefined]

    7].break和continue /** * JavaScript 的break和continue语句 * break 跳出switch()语句 * break 用于跳出循环 * continue ...

  2. [Javascript] Classify JSON text data with machine learning in Natural

    In this lesson, we will learn how to train a Naive Bayes classifier and a Logistic Regression classi ...

  3. javascript 中break、 continue、函数不能重载

    在javascript中,break与continue有着显著的差别. 如果遇到break语句,会终止最内层循环,无论后面还有多少计算. 如果遇到continue,只会终止此次循环,后面的自循环依然执 ...

  4. javascript中break,continue和return语句用法小结:

    Break语句会使程序立刻退出包含在最底层的循环或者退出一个switch语句,它是用来退出循环或者switch语句. 例如: <script type="text/javascript ...

  5. JavaScript Prototype in Plain Language

    非常好的文章: http://javascriptissexy.com/javascript-prototype-in-plain-detailed-language/ jan. 25 2013 14 ...

  6. javascript . 02 break和continue、while、数组、冒泡排序

    1.1 知识点 NaN是number类型 null是object类型 /**  + 回车  多行注释 table 会为内部的tr td 自动补齐闭合标签 1.2 循环结构 1.2.1  Break和c ...

  7. javascript中break和continue的区别

    1.break:跳出循环. 2.continue:跳过循环中的一个迭代.(迭代:重复反馈过程的滑动,其目的是为了逼近所需目标或结果.每一次对过程的重复称为一次"迭代",而每一次迭代 ...

  8. javascript中break与continue,及return的区别

    a).在循环体中, break是跳出整个循环,不执行以后的循环语句: continue是结束本次循环语句,进入下一个循环: b). 在if判断句,结束该函数的执行时,用 return: c). 在函数 ...

  9. javascript中break和continue

    1.break break语句会立即退出循环,强制执行循环后面的语句 var num = 0; for(var i=1;i<10;i++){ if(i%5 == 0){ break; } num ...

随机推荐

  1. 【Codeforces Round #457 (Div. 2) B】Jamie and Binary Sequence

    [链接] 我是链接,点我呀:) [题意] 在这里输入题意 [题解] 把n分解成二进制的形式. n=2^a0+2^a1+...+2^a[q-1] 则固定就是长度为q的序列. 要想扩展为长为k的序列. 可 ...

  2. UVALive 6527 Counting ones dfs(水

    题目链接:点击打开链接 #include <cstdio> #include <vector> using namespace std; typedef long long l ...

  3. 推断一个java文件和邮箱格式是否合法

    import java.util.Scanner; public class StringTest { public static void main(String[] args) { int bac ...

  4. atxserver2安装与使用

    atxserver2的使用 1.首先clone atxserver2代码,此时使用pip3 install requirements后执行python main.py 会提示“ [WinError 1 ...

  5. Windows上安装多个MySQL实例(转)

    在学习和开发过程中有时候会用到多个MySQL数据库,比如Master-Slave集群.分库分表,开发阶段在一台机器上安装多个MySQL实例就显得方便不少. 在 MySQL教程-基础篇-1.1-Wind ...

  6. 1.1 Introduction中 Apache Kafka™ is a distributed streaming platform. What exactly does that mean?(官网剖析)(博主推荐)

    不多说,直接上干货! 一切来源于官网 http://kafka.apache.org/documentation/ Apache Kafka™ is a distributed streaming p ...

  7. Android Unable to execute dex: method ID not in [0, 0xffff]: 65536 问题解决方法

    开始一个新项目的时候,Build工程的时候一直报这个错误: 控制台报错误:Conversion to Dalvik format failed: Unable to execute dex: meth ...

  8. 学习笔记:Vue——处理边界情况

    访问元素&组件 01.访问根实例 $root // Vue 根实例 new Vue({ data: { foo: 1 }, computed: { bar: function () { /* ...

  9. PatentTips - Adaptive algorithm for selecting a virtualization algorithm in virtual machine environments

    BACKGROUND A Virtual Machine (VM) is an efficient, isolated duplicate of a real computer system. Mor ...

  10. 利用mysql5.6 的st_distance 实现按照距离远近排序。 (转载)

    http://blog.csdn.net/zhouzhiwengang/article/details/53612481