liblinear usage summary
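liblinear is an improved version of libsvm's linear kernel, designed specifically for classification on millions of samples, which is exactly what I needed for this data mining experiment.
liblinear is used much like libsvm. I used the .exe files and sent commands to the console through Python's subprocess module, which was enough to complete the experiment.
The two core commands are: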
train train.txt
predict test.txt train.txt.model output.txt
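Because the kernel is linear, I did not set the parameters c and g.
Training the model on 500,000 articles took only 340 seconds, and predicting 500,000 articles took only 6 seconds.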
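The train.txt and test.txt files passed to the two commands are expected to be in the sparse text format that liblinear shares with libsvm: one sample per line, the label first, then index:value pairs with 1-based indices in ascending order. Here is a minimal sketch of producing such a line. The helper name format_sample and the feature dictionary are hypothetical and not part of my experiment code.

```python
def format_sample(label, features):
    """Format one sample as '<label> <index>:<value> ...'.

    `features` is a hypothetical dict mapping 1-based feature indices
    to their non-zero values.
    """
    if any(index < 1 for index in features):
        raise ValueError("feature indices must start at 1")
    # Indices are written in ascending order.
    pairs = " ".join(f"{i}:{features[i]:g}" for i in sorted(features))
    return f"{label} {pairs}".rstrip()


# Example: a sample of class 3 with three non-zero features.
line = format_sample(3, {7: 0.5, 2: 1.0, 15: 0.25})
print(line)  # 3 2:1 7:0.5 15:0.25
```

Writing one such line per article gives a file that can be handed directly to train or predict.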
from subprocess import Popen, PIPE
from time import time

start_time = time()
print("Training")
cmd = "train train.txt"
# Capture stdout so the training log does not flood the console.
Popen(cmd, shell=True, stdout=PIPE).communicate()
print("Training finished", str(time() - start_time))

start_time = time()
print("Predicting")
cmd = "predict test.txt train.txt.model output.txt"
Popen(cmd, shell=True).communicate()
print("Prediction finished", str(time() - start_time))

# Compute statistics
# Read the true labels of the test set
start_time = time()
print("Statistics")
test_filename = "test.txt"
real_class = []
with open(test_filename, "r", encoding="utf-8") as f:
    for line in f:
        # The label is the first character of each line (classes 0-9).
        real_class.append(line[0])

# Total number of samples
total_sample = len(real_class)

# Read the predicted labels
predict_filename = "output.txt"
with open(predict_filename, "r", encoding="utf-8") as f_predict:
    predict_class = f_predict.read().split()

# Count the correctly predicted articles
T = 0
for real, predict in zip(real_class, predict_class):
    if int(real) == int(predict):
        T += 1
accuracy = T / total_sample * 100
print("Accuracy:", str(accuracy) + "%")

# class_label = ["0","1","2","3","4","5","6","7","8","9"]
num_to_cate = {0: "it", 1: "sports", 2: "military", 3: "finance", 4: "health",
               5: "auto", 6: "real_estate", 7: "culture", 8: "education", 9: "entertainment"}
class_label = ["it", "sports", "military", "finance", "health",
               "auto", "real_estate", "culture", "education", "entertainment"]

# The counters start at 1.0 rather than 0. This keeps the later divisions
# from hitting zero, at the cost of a slight bias in the results.
predict_precision = dict.fromkeys(class_label, 1.0)  # times each class was predicted
predict_true = dict.fromkeys(class_label, 1.0)       # correct predictions per class
predict_recall = dict.fromkeys(class_label, 1.0)     # true samples per class
predict_F = dict.fromkeys(class_label, 0.0)

# Confusion matrix: mat[real category][predicted category]
mat = {k: dict.fromkeys(class_label, 0) for k in class_label}
# print(str(mat))

for real, predict in zip(real_class, predict_class):
    real = int(real)
    predict = int(predict)
    mat[num_to_cate[real]][num_to_cate[predict]] += 1
    predict_precision[num_to_cate[predict]] += 1
    predict_recall[num_to_cate[real]] += 1
    if real == predict:
        predict_true[num_to_cate[predict]] += 1
# print(str(predict_precision))
# print(str(predict_recall))
# print(str(predict_true))

# Print the confusion matrix
for k, v in mat.items():
    print(k + ":" + str(v))

# Compute precision and recall (the count dictionaries are overwritten with the ratios)
for x in range(len(class_label)):
    cate = num_to_cate[x]
    predict_precision[cate] = predict_true[cate] / predict_precision[cate]
    predict_recall[cate] = predict_true[cate] / predict_recall[cate]

# Compute the F-measure
for x in range(len(class_label)):
    cate = num_to_cate[x]
    predict_F[cate] = 2 * predict_recall[cate] * predict_precision[cate] / (predict_precision[cate] + predict_recall[cate])

print("Statistics finished", str(time() - start_time))
print("Precision:", str(predict_precision))
print("Recall:", str(predict_recall))
print("F-measure:", str(predict_F))

print("Saving results")
final_result_filename = "./finalresult.txt"
# The with block closes the file, so everything is flushed to disk.
with open(final_result_filename, "w", encoding="utf-8") as f:
    for k, v in mat.items():
        f.write(k + ":" + str(v) + "\n")
    f.write("\n")
    f.write("Accuracy: " + str(accuracy) + "%" + "\n\n")
    f.write("Precision: " + str(predict_precision) + "\n\n")
    f.write("Recall: " + str(predict_recall) + "\n\n")
    f.write("F-measure: " + str(predict_F) + "\n\n")
print("Results saved")

# cate_to_num = {"it":0,"sports":1,"military":2,"overseas_chinese":3,"domestic":4,"international":5,"real_estate":6,"culture_entertainment":7,"society":8,"finance":9}
# num_to_cate = {0:"it",1:"sports",2:"military",3:"overseas_chinese",4:"domestic",5:"international",6:"real_estate",7:"culture_entertainment",8:"society",9:"finance"}
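The per-class statistics in the script can also be condensed into one function. The following is only a sketch under my own assumptions: the name per_class_metrics is hypothetical, it counts from zero instead of 1.0 as the script does, and it returns 0.0 wherever a denominator would be zero.

```python
from collections import Counter


def per_class_metrics(real_labels, predicted_labels):
    """Return {label: (precision, recall, f1)} for every label seen."""
    true_pos = Counter()   # correct predictions per class
    predicted = Counter()  # how often each class was predicted
    actual = Counter()     # how often each class really occurs
    for real, pred in zip(real_labels, predicted_labels):
        predicted[pred] += 1
        actual[real] += 1
        if real == pred:
            true_pos[pred] += 1

    metrics = {}
    for label in sorted(set(predicted) | set(actual)):
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / actual[label] if actual[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Tiny made-up example, not data from the experiment.
real = [0, 0, 1, 1, 2]
pred = [0, 1, 1, 1, 0]
for label, (p, r, f) in per_class_metrics(real, pred).items():
    print(label, round(p, 3), round(r, 3), round(f, 3))
```

With the labels read from test.txt and output.txt converted to int, the same function could be called on real_class and predict_class.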