Refer to: https://stackoverflow.com/a/10527953

code:

# -*- coding: utf-8 -*-
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer X_train = np.array(["new york is a hell of a town",
"new york was originally dutch",
"the big apple is great",
"new york is also called the big apple",
"nyc is nice",
"people abbreviate new york city as nyc",
"the capital of great britain is london",
"london is in the uk",
"london is in england",
"london is in great britain",
"it rains a lot in london",
"london hosts the british museum",
"new york is great and so is london",
"i like london better than new york"])
y_train_text = [["new york"],["new york"],["new york"],["new york"],["new york"],
["new york"],["london"],["london"],["london"],["london"],
["london"],["london"],["new york","london"],["new york","london"]] X_test = np.array(['nice day in nyc',
'welcome to london',
'london is rainy',
'it is raining in britian',
'it is raining in britian and the big apple',
'it is raining in britian and nyc',
'hello welcome to new york. enjoy it here and london too'])
target_names = ['New York', 'London'] mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(y_train_text) classifier = Pipeline([
('vectorizer', CountVectorizer()),
('tfidf', TfidfTransformer()),
('clf', OneVsRestClassifier(LinearSVC()))]) classifier.fit(X_train, Y)
predicted = classifier.predict(X_test)
all_labels = mlb.inverse_transform(predicted) for item, labels in zip(X_test, all_labels):
print('{0} => {1}'.format(item, ', '.join(labels)))

Output:

nice day in nyc => new york
welcome to london => london
london is rainy => london
it is raining in britian => london
it is raining in britian and the big apple => new york
it is raining in britian and nyc => london, new york
hello welcome to new york. enjoy it here and london too => london, new york

【Scikit】实现Multi-label text classification代码模板的更多相关文章

  1. [Bayes] Maximum Likelihood estimates for text classification

    Naïve Bayes Classifier. We will use, specifically, the Bernoulli-Dirichlet model for text classifica ...

  2. 论文阅读:《Bag of Tricks for Efficient Text Classification》

    论文阅读:<Bag of Tricks for Efficient Text Classification> 2018-04-25 11:22:29 卓寿杰_SoulJoy 阅读数 954 ...

  3. 论文翻译——Character-level Convolutional Networks for Text Classification

    论文地址 Abstract Open-text semantic parsers are designed to interpret any statement in natural language ...

  4. 论文解读(XR-Transformer)Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

    Paper Information Title:Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text C ...

  5. 给label text 上色 && 给textfiled placeholder 上色

    1.给label text 上色: NSInteger stringLength = ; stringLength = model.ToUserNickName.length; NSMutableAt ...

  6. [转] Implementing a CNN for Text Classification in TensorFlow

    Github上的一个开源项目,文档讲得极清晰 Github - https://github.com/dennybritz/cnn-text-classification-tf 原文- http:// ...

  7. [Tensorflow] RNN - 04. Work with CNN for Text Classification

    Ref: Combining CNN and RNN for spoken language identification Ref: Convolutional Methods for Text [1 ...

  8. Implementing a CNN for Text Classification in TensorFlow

    参考: 1.Understanding Convolutional Neural Networks for NLP 2.Implementing a CNN for Text Classificati ...

  9. 论文列表——text classification

    https://blog.csdn.net/BitCs_zt/article/details/82938086 列出自己阅读的text classification论文的列表,以后有时间再整理相应的笔 ...

随机推荐

  1. java知识思维图解

  2. android:EditText控件

    EditText 是程序用于和用户进行交互的另一个重要控件,它允许用户在控件里输入和编 辑内容,并可以在程序中对这些内容进行处理.EditText 的应用场景应该算是非常普遍了, 发短信.发微博.聊 ...

  3. Go学习入门

    1. 为什么要学习Go Go语言宣称为互联网时代的C语言,那她有那些特性值得我们必须学习呢: 并行与分布式支持.除了我们日常熟悉的进程和线程,Go语言中提供了协程coroutine,从而简化了并行开发 ...

  4. iOS:百度长语音识别具体的封装:识别、播放、进度刷新

    一.介绍 以前做过讯飞语音识别,比较简单,识别率很不错,但是它的识别时间是有限制的,最多60秒.可是有的时候我们需要更长的识别时间,例如朗诵古诗等功能.当然讯飞语音也是可以通过曲线救国来实现,就是每达 ...

  5. 隐藏控件--HiddenField控件

    HiddenField控件百度查的结果(帮助大家对比理解): HiddenField控件顾名思义就是隐藏输入框的服务器控件,它能让你保存那些不需要显示在页面上的且对安全性要求不高的数据.也许这个时候应 ...

  6. VC++网络安全编程范例(11)-SSL高级加密网络通信(转)

    SSL(Secure Sockets Layer 安全套接层),及其继任者传输层安全(Transport Layer Security,TLS)是为网络通信提供安全及数据完整性的一种安全协议.TLS与 ...

  7. Qt只QSetting

    The QSettings class provides persistent platform-independent application settings. 提供跨平台的持久性设置. QSet ...

  8. IIS 之 应用程序池

    IIS(Internet Information Services),由于我使用的是Windows10系统,所以本文以其内置 10.0.14393.0 版本说明. 应用程序池 → 右键(待设置应用程序 ...

  9. 【SqlServer】解析SqlServer的分页

    方式1: 假设页数是10,现在要拿出第5页的内容,查询语句如下: --10代表分页的大小 * from test where id not in ( --40是这么计算出来的:10*(5-1) id ...

  10. OpenCV 学习笔记 04 深度估计与分割——GrabCut算法与分水岭算法

    1 使用普通摄像头进行深度估计 1.1 深度估计原理 这里会用到几何学中的极几何(Epipolar Geometry),它属于立体视觉(stereo vision)几何学,立体视觉是计算机视觉的一个分 ...