Running Mahout Naive Bayes from the Command Line
landen@landen-Lenovo:~/文档/20news$ mahout trainclassifier --help
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/landen/UntarFile/hadoop-1.0.4
No HADOOP_CONF_DIR set, using /home/landen/UntarFile/hadoop-1.0.4/conf
MAHOUT-JOB: /home/landen/UntarFile/mahout-distribution-0.6/mahout-examples-0.6-job.jar
Warning: $HADOOP_HOME is deprecated.
Usage:
 [--gramSize <gramSize> --help --input <input> --output <output>
 --classifierType <classifierType> --dataSource <dataSource> --alpha <a>
 --minDf <minDf> --minSupport <minSupport> --skipCleanup]
Options
  --gramSize (-ng) gramSize                 Size of the n-gram. Default Value: 1
  --help (-h)                               Print out help
  --input (-i) input                        Path to job input directory.
  --output (-o) output                      The directory pathname for output.
  --classifierType (-type) classifierType   Type of classifier: bayes|cbayes. Default: bayes
  --dataSource (-source) dataSource         Location of model: hdfs. Default Value: hdfs
  --alpha (-a) a                            Smoothing parameter Default Value: 1.0
  --minDf (-mf) minDf                       Minimum Term Document Frequency: 1
  --minSupport (-ms) minSupport             Minimum Support (Term Frequency): 1
  --skipCleanup (-sc)                       Skip cleanup of feature extraction output
13/07/12 16:32:22 INFO driver.MahoutDriver: Program took 52 ms (Minutes: 9.5E-4)
landen@landen-Lenovo:~/文档/20news$ mahout testclassifier --help
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/landen/UntarFile/hadoop-1.0.4
No HADOOP_CONF_DIR set, using /home/landen/UntarFile/hadoop-1.0.4/conf
MAHOUT-JOB: /home/landen/UntarFile/mahout-distribution-0.6/mahout-examples-0.6-job.jar
Warning: $HADOOP_HOME is deprecated.
Usage:
[--defaultCat <defaultCat> --testDir <testDir> --encoding <encoding>
--gramSize <gramSize> --model <model> --classifierType <classifierType>
--dataSource <dataSource> --help --method <method> --verbose --alpha <a>
--confusionMatrix <confusionMatrix>]
Options
  --defaultCat (-default) defaultCat        The default category Default Value: unknown
  --testDir (-d) testDir                    The directory where test documents resides in
  --encoding (-e) encoding                  The file encoding. Defaults to UTF-8
  --gramSize (-ng) gramSize                 Size of the n-gram. Default Value: 1
  --model (-m) model                        The path on HDFS as defined by the -source parameter
  --classifierType (-type) classifierType   Type of classifier: bayes|cbayes. Default Value: bayes
  --dataSource (-source) dataSource         Location of model: hdfs
  --help (-h)                               Print out help
  --method (-method) method                 Method of Classification: sequential|mapreduce. Default Value: mapreduce
  --verbose (-v)                            Output which values were correctly and incorrectly classified
  --alpha (-a) a                            Smoothing parameter Default Value: 1.0
  --confusionMatrix (-cm) confusionMatrix   Export ConfusionMatrix as SequenceFile
13/07/12 16:32:37 INFO driver.MahoutDriver: Program took 42 ms (Minutes: 7.0E-4)
landen@landen-Lenovo:~/文档/20news$ hadoop fs -ls /20news
Warning: $HADOOP_HOME is deprecated.
Found 3 items
drwxr-xr-x - landen supergroup 0 2013-07-11 17:16 /20news/20news-test
drwxr-xr-x - landen supergroup 0 2013-07-11 17:16 /20news/20news-train
drwxr-xr-x - landen supergroup 0 2013-07-11 21:54 /20news/model
landen@landen-Lenovo:~/文档/20news$ mahout testclassifier -m /20news/model -d /20news/20news-test -type bayes -ng 3 -source hdfs -method mapreduce
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/landen/UntarFile/hadoop-1.0.4
No HADOOP_CONF_DIR set, using /home/landen/UntarFile/hadoop-1.0.4/conf
MAHOUT-JOB: /home/landen/UntarFile/mahout-distribution-0.6/mahout-examples-0.6-job.jar
Warning: $HADOOP_HOME is deprecated.
13/07/12 16:39:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/07/12 16:40:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/07/12 16:40:00 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/12 16:40:00 INFO mapred.FileInputFormat: Total input paths to process : 20
13/07/12 16:40:01 INFO mapred.JobClient: Running job: job_201307111633_0009
13/07/12 16:40:02 INFO mapred.JobClient: map 0% reduce 0%
13/07/12 16:43:18 INFO mapred.JobClient: map 3% reduce 0%
13/07/12 16:43:22 INFO mapred.JobClient: map 5% reduce 0%
13/07/12 16:43:28 INFO mapred.JobClient: map 6% reduce 0%
13/07/12 16:43:37 INFO mapred.JobClient: map 8% reduce 0%
13/07/12 16:43:42 INFO mapred.JobClient: map 4% reduce 0%
13/07/12 16:43:56 INFO mapred.JobClient: Task Id : attempt_201307111633_0009_m_000001_0, Status : FAILED
13/07/12 16:44:06 INFO mapred.JobClient: map 5% reduce 1%
13/07/12 16:44:13 INFO mapred.JobClient: map 6% reduce 1%
13/07/12 16:44:23 INFO mapred.JobClient: map 7% reduce 1%
13/07/12 16:44:29 INFO mapred.JobClient: map 8% reduce 1%
13/07/12 16:44:35 INFO mapred.JobClient: map 11% reduce 1%
13/07/12 16:44:38 INFO mapred.JobClient: map 12% reduce 1%
13/07/12 16:44:44 INFO mapred.JobClient: map 13% reduce 1%
13/07/12 16:44:47 INFO mapred.JobClient: map 9% reduce 1%
13/07/12 16:44:53 INFO mapred.JobClient: Task Id : attempt_201307111633_0009_m_000002_0, Status : FAILED
Error: Java heap space
attempt_201307111633_0009_m_000002_0: log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.Task).
attempt_201307111633_0009_m_000002_0: log4j:WARN Please initialize the log4j system properly.
13/07/12 16:45:03 INFO mapred.JobClient: map 9% reduce 3%
13/07/12 16:45:28 INFO mapred.JobClient: map 14% reduce 3%
13/07/12 16:45:31 INFO mapred.JobClient: map 17% reduce 3%
13/07/12 16:45:34 INFO mapred.JobClient: map 20% reduce 3%
13/07/12 16:45:37 INFO mapred.JobClient: map 20% reduce 5%
13/07/12 16:45:46 INFO mapred.JobClient: map 20% reduce 6%
13/07/12 16:45:55 INFO mapred.JobClient: map 22% reduce 6%
13/07/12 16:45:58 INFO mapred.JobClient: map 24% reduce 6%
13/07/12 16:46:01 INFO mapred.JobClient: map 25% reduce 6%
13/07/12 16:46:07 INFO mapred.JobClient: map 25% reduce 8%
13/07/12 16:46:22 INFO mapred.JobClient: map 26% reduce 8%
13/07/12 16:46:25 INFO mapred.JobClient: map 27% reduce 8%
13/07/12 16:46:31 INFO mapred.JobClient: map 28% reduce 8%
13/07/12 16:46:40 INFO mapred.JobClient: map 29% reduce 8%
13/07/12 16:47:04 INFO mapred.JobClient: map 30% reduce 8%
13/07/12 16:47:16 INFO mapred.JobClient: map 30% reduce 10%
13/07/12 16:47:32 INFO mapred.JobClient: Task Id : attempt_201307111633_0009_m_000007_0, Status : FAILED
Error: Java heap space
13/07/12 16:47:56 INFO mapred.JobClient: map 34% reduce 10%
13/07/12 16:48:13 INFO mapred.JobClient: map 34% reduce 11%
13/07/12 16:48:19 INFO mapred.JobClient: map 39% reduce 11%
13/07/12 16:48:22 INFO mapred.JobClient: map 40% reduce 11%
13/07/12 16:48:34 INFO mapred.JobClient: map 40% reduce 13%
13/07/12 16:48:43 INFO mapred.JobClient: map 44% reduce 13%
13/07/12 16:48:46 INFO mapred.JobClient: map 45% reduce 13%
13/07/12 16:48:58 INFO mapred.JobClient: map 45% reduce 15%
13/07/12 16:49:04 INFO mapred.JobClient: map 48% reduce 15%
13/07/12 16:49:07 INFO mapred.JobClient: map 50% reduce 15%
13/07/12 16:49:13 INFO mapred.JobClient: map 50% reduce 16%
13/07/12 16:49:25 INFO mapred.JobClient: map 53% reduce 16%
13/07/12 16:49:28 INFO mapred.JobClient: map 54% reduce 16%
13/07/12 16:49:43 INFO mapred.JobClient: map 59% reduce 18%
13/07/12 16:49:58 INFO mapred.JobClient: map 59% reduce 20%
13/07/12 16:50:04 INFO mapred.JobClient: map 64% reduce 20%
13/07/12 16:50:13 INFO mapred.JobClient: map 64% reduce 21%
13/07/12 16:50:25 INFO mapred.JobClient: map 69% reduce 21%
13/07/12 16:50:43 INFO mapred.JobClient: map 69% reduce 23%
13/07/12 16:50:46 INFO mapred.JobClient: map 73% reduce 23%
13/07/12 16:50:49 INFO mapred.JobClient: map 75% reduce 23%
13/07/12 16:50:58 INFO mapred.JobClient: map 75% reduce 25%
13/07/12 16:51:08 INFO mapred.JobClient: map 78% reduce 25%
13/07/12 16:51:11 INFO mapred.JobClient: map 80% reduce 25%
13/07/12 16:51:23 INFO mapred.JobClient: map 80% reduce 26%
13/07/12 16:51:29 INFO mapred.JobClient: map 83% reduce 26%
13/07/12 16:51:32 INFO mapred.JobClient: map 85% reduce 26%
13/07/12 16:51:44 INFO mapred.JobClient: map 85% reduce 28%
13/07/12 16:51:50 INFO mapred.JobClient: map 89% reduce 28%
13/07/12 16:51:53 INFO mapred.JobClient: map 90% reduce 28%
13/07/12 16:52:14 INFO mapred.JobClient: map 90% reduce 30%
13/07/12 16:52:20 INFO mapred.JobClient: map 95% reduce 30%
13/07/12 16:52:26 INFO mapred.JobClient: map 95% reduce 31%
13/07/12 16:52:49 INFO mapred.JobClient: Task Id : attempt_201307111633_0009_m_000004_0, Status : FAILED
org.apache.hadoop.io.SecureIOUtils$AlreadyExistsException: EEXIST: 文件已存在
at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:167)
at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:312)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:385)
at org.apache.hadoop.mapred.Child$4.run(Child.java:257)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: EEXIST: 文件已存在
at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)
at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:161)
... 7 more
attempt_201307111633_0009_m_000004_0: Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError: Java heap space
attempt_201307111633_0009_m_000004_0: at java.util.Arrays.copyOfRange(Arrays.java:2694)
attempt_201307111633_0009_m_000004_0: at java.lang.String.<init>(String.java:203)
attempt_201307111633_0009_m_000004_0: Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread for syncLogs"
13/07/12 16:53:02 INFO mapred.JobClient: map 97% reduce 31%
13/07/12 16:53:05 INFO mapred.JobClient: map 95% reduce 31%
13/07/12 16:53:10 INFO mapred.JobClient: Task Id : attempt_201307111633_0009_m_000004_1, Status : FAILED
Error: Java heap space
13/07/12 16:53:20 INFO mapred.JobClient: map 96% reduce 31%
13/07/12 16:53:23 INFO mapred.JobClient: map 98% reduce 31%
13/07/12 16:53:26 INFO mapred.JobClient: map 100% reduce 31%
13/07/12 16:53:35 INFO mapred.JobClient: map 100% reduce 100%
13/07/12 16:53:41 INFO mapred.JobClient: Job complete: job_201307111633_0009
13/07/12 16:53:41 INFO mapred.JobClient: Counters: 30
13/07/12 16:53:41 INFO mapred.JobClient: Job Counters
13/07/12 16:53:41 INFO mapred.JobClient: Launched reduce tasks=1
13/07/12 16:53:41 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=1153539
13/07/12 16:53:41 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/07/12 16:53:41 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/07/12 16:53:41 INFO mapred.JobClient: Launched map tasks=25
13/07/12 16:53:41 INFO mapred.JobClient: Data-local map tasks=25
13/07/12 16:53:41 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=596582
13/07/12 16:53:41 INFO mapred.JobClient: File Input Format Counters
13/07/12 16:53:41 INFO mapred.JobClient: Bytes Read=10399829
13/07/12 16:53:41 INFO mapred.JobClient: File Output Format Counters
13/07/12 16:53:41 INFO mapred.JobClient: Bytes Written=13482
13/07/12 16:53:41 INFO mapred.JobClient: FileSystemCounters
13/07/12 16:53:41 INFO mapred.JobClient: FILE_BYTES_READ=11889
13/07/12 16:53:41 INFO mapred.JobClient: HDFS_BYTES_READ=421848302
13/07/12 16:53:41 INFO mapred.JobClient: FILE_BYTES_WRITTEN=497127
13/07/12 16:53:41 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=13482
13/07/12 16:53:41 INFO mapred.JobClient: Map-Reduce Framework
13/07/12 16:53:41 INFO mapred.JobClient: Map output materialized bytes=12003
13/07/12 16:53:41 INFO mapred.JobClient: Map input records=7532
13/07/12 16:53:41 INFO mapred.JobClient: Reduce shuffle bytes=11395
13/07/12 16:53:41 INFO mapred.JobClient: Spilled Records=460
13/07/12 16:53:41 INFO mapred.JobClient: Map output bytes=377830
13/07/12 16:53:41 INFO mapred.JobClient: Total committed heap usage (bytes)=2999517184
13/07/12 16:53:41 INFO mapred.JobClient: CPU time spent (ms)=293160
13/07/12 16:53:41 INFO mapred.JobClient: Map input bytes=10399829
13/07/12 16:53:41 INFO mapred.JobClient: SPLIT_RAW_BYTES=2273
13/07/12 16:53:41 INFO mapred.JobClient: Combine input records=7532
13/07/12 16:53:41 INFO mapred.JobClient: Reduce input records=230
13/07/12 16:53:41 INFO mapred.JobClient: Reduce input groups=230
13/07/12 16:53:41 INFO mapred.JobClient: Combine output records=230
13/07/12 16:53:41 INFO mapred.JobClient: Physical memory (bytes) snapshot=3793125376
13/07/12 16:53:41 INFO mapred.JobClient: Reduce output records=230
13/07/12 16:53:41 INFO mapred.JobClient: Virtual memory (bytes) snapshot=8323325952
13/07/12 16:53:41 INFO mapred.JobClient: Map output records=7532
13/07/12 16:53:43 INFO bayes.BayesClassifierDriver: =======================================================
Confusion Matrix
-------------------------------------------------------
   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t  <--Classified as
 381   0   0   0   0   9   2   0   1   0   1   0   1   0   0   0   0   0   3   0  |  398  a = rec.motorcycles
   1 284   0   0   0   1   4   0   6   2  11   0   3  65   0   0   5   0   3  10  |  395  b = comp.windows.x
   1   0 340   3   0   2   6   1   0   0   0   0   1   1  12   0   7   0   2   0  |  376  c = talk.politics.mideast
   4   0   1 330   0   2   2   0   0   2   1   1   3   0   1   3  12   0   2   0  |  364  d = talk.politics.guns
   3   0   4  31  37   6   9   1   0  10   0   0   0   6  93   9   6  36   0   0  |  251  e = talk.religion.misc
   7   0   0   0   0 361   2   2   0   1   3   0   6   1   0   1   0   0  11   1  |  396  f = rec.autos
   0   0   0   0   0   1 383   9   1   0   0   0   0   0   0   0   0   0   3   0  |  397  g = rec.sport.baseball
   1   0   0   0   0   0   8 382   1   0   0   0   2   1   1   0   2   0   1   0  |  399  h = rec.sport.hockey
   1   0   0   0   0   3   3   0 335   4   5   0  10   4   0   0   2   0  10   8  |  385  i = comp.sys.mac.hardware
   0   3   0   0   0   0   1   0   0 367   0   0   5  10   1   3   2   0   2   0  |  394  j = sci.space
   0   0   0   0   0   2   1   0  27   1 300   0  19  11   0   0   0   0  11  20  |  392  k = comp.sys.ibm.pc.hardware
   6   0   2 110   0   6  11   4   1  14   0 104   2   1  11  10  26   1   1   0  |  310  l = talk.politics.misc
   6   0   1   0   0   4   1   0   8   2  16   0 314   9   0   4  15   0   5   8  |  393  m = sci.electronics
   0  13   1   0   0   2   6   0  11   5  11   0  11 304   0   2  10   0   5   8  |  389  n = comp.graphics
   2   0   0   0   0   0   5   1   0   2   1   0   1   3 373   5   0   2   1   2  |  398  o = soc.religion.christian
   3   0   0   1   0   2   3   3   2   3   2   0  12  10   8 337   1   0   9   0  |  396  p = sci.med
   0   1   0   1   0   0   4   0   3   0   1   0   3   8   0   2 370   0   2   1  |  396  q = sci.crypt
   9   0   4  10   1   4   6   1   2   4   2   0   0   2  77  14  12 170   0   1  |  319  r = alt.atheism
   4   0   0   0   0   9   1   1   9   1  12   0   6   3   0   2   0   0 340   2  |  390  s = misc.forsale
   6   5   0   0   0   1   8   0   8   5  50   0   2  39   1   0   8   0   3 258  |  394  t = comp.os.ms-windows.misc
13/07/12 16:53:43 INFO driver.MahoutDriver: Program took 824521 ms (Minutes: 13.742016666666666)
landen@landen-Lenovo:~/文档/20news$
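
The log above shows only the raw confusion matrix, so the overall accuracy has to be read off by hand. The following is a minimal sketch in Python (not part of the original session) that derives it. It assumes each row is the true category, the diagonal entry is the number of correctly classified documents, and the figure after `|` is the row total. The lists `correct` and `totals` and the helper `summarize` are illustrative names; the numbers are copied from the matrix.

```python
# Labels in matrix order a..t, as printed by BayesClassifierDriver above.
labels = [
    "rec.motorcycles", "comp.windows.x", "talk.politics.mideast",
    "talk.politics.guns", "talk.religion.misc", "rec.autos",
    "rec.sport.baseball", "rec.sport.hockey", "comp.sys.mac.hardware",
    "sci.space", "comp.sys.ibm.pc.hardware", "talk.politics.misc",
    "sci.electronics", "comp.graphics", "soc.religion.christian",
    "sci.med", "sci.crypt", "alt.atheism", "misc.forsale",
    "comp.os.ms-windows.misc",
]

# Diagonal entries of the matrix (assumed: documents classified correctly).
correct = [381, 284, 340, 330, 37, 361, 383, 382, 335, 367,
           300, 104, 314, 304, 373, 337, 370, 170, 340, 258]

# Row totals, i.e. the number printed after "|" on each row.
totals = [398, 395, 376, 364, 251, 396, 397, 399, 385, 394,
          392, 310, 393, 389, 398, 396, 396, 319, 390, 394]


def summarize(correct, totals):
    """Return (overall accuracy, per-class recall list)."""
    recall = [c / n for c, n in zip(correct, totals)]
    accuracy = sum(correct) / sum(totals)
    return accuracy, recall


accuracy, recall = summarize(correct, totals)
print(f"documents={sum(totals)} correct={sum(correct)} accuracy={accuracy:.4f}")

# List categories from weakest to strongest recall.
for name, r in sorted(zip(labels, recall), key=lambda pair: pair[1]):
    print(f"{name:26s} {r:.3f}")
```

The row totals add up to 7532, which matches the `Map input records=7532` counter in the job output, so the matrix covers the whole test set. Under the assumptions above, the weakest categories are talk.religion.misc and talk.politics.misc, whose documents are mostly assigned to soc.religion.christian and talk.politics.guns respectively.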