Today, while working through a canopy clustering example, I ran into this problem, so I am recording it here. The source code is below:

  

package czx.com.mahout;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.common.AbstractJob;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.VectorWritable;

public class TextVecWrite extends AbstractJob {

    public static void main(String[] args) throws Exception {
        // Propagate the job's return code to the shell.
        System.exit(ToolRunner.run(new Configuration(), new TextVecWrite(), args));
    }

    /**
     * TextVecWriterMapper: turns each whitespace-separated line of numbers into a vector.
     * @author czx
     */
    public static class TextVecWriterMapper
            extends Mapper<LongWritable, Text, LongWritable, VectorWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // trim() avoids an empty first token when a line starts with whitespace.
            String[] split = value.toString().trim().split("\\s+");
            RandomAccessSparseVector vector = new RandomAccessSparseVector(split.length);
            for (int i = 0; i < split.length; ++i) {
                vector.set(i, Double.parseDouble(split[i]));
            }
            context.write(key, new VectorWritable(vector));
        }
    }

    /**
     * TextVectorWritableReducer: writes every vector through unchanged.
     * @author czx
     */
    public static class TextVectorWritableReducer
            extends Reducer<LongWritable, VectorWritable, LongWritable, VectorWritable> {
        @Override
        protected void reduce(LongWritable key, Iterable<VectorWritable> values, Context context)
                throws IOException, InterruptedException {
            for (VectorWritable v : values) {
                context.write(key, v);
            }
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // Register the standard -i/--input and -o/--output options.
        addInputOption();
        addOutputOption();
        if (parseArguments(args) == null) {
            return -1;
        }
        Path input = getInputPath();
        Path output = getOutputPath();
        Configuration conf = getConf();

        Job job = Job.getInstance(conf, "textvectorWritable with input:" + input.getName());
        job.setJarByClass(TextVecWrite.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        job.setMapperClass(TextVecWriterMapper.class);
        job.setReducerClass(TextVectorWritableReducer.class);
        job.setMapOutputKeyClass(LongWritable.class);
        // The original called setOutputValueClass twice; the map output value class was never set.
        job.setMapOutputValueClass(VectorWritable.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(VectorWritable.class);

        FileInputFormat.addInputPath(job, input);
        SequenceFileOutputFormat.setOutputPath(job, output);

        if (!job.waitForCompletion(true)) {
            throw new InterruptedException("TextVecWrite job failed processing " + input);
        }
        return 0;
    }
}

  Compile the program, package it as a JAR, and run it as follows:

  

 hadoop jar ClusteringUtils.jar czx.com.mahout.TextVecWrite -i /user/hadoop/testdata/synthetic_control.data -o /home/czx/1

  However, the following error appeared:

  

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli2/Option
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli2.Option
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 3 more

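  In the end, the cause turned out to be a missing JAR: when copying the relevant JARs from the Mahout root directory into hadoop-2.4.1/share/hadoop/common/lib, I had left out mahout-core-0.9-job.jar. The job JAR bundles Mahout's dependencies, including the commons-cli2 classes that AbstractJob uses for option parsing, which is why org.apache.commons.cli2.Option could not be found. After copying mahout-core-0.9-job.jar into that folder and restarting Hadoop, the job ran.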
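
  To check whether a class such as org.apache.commons.cli2.Option is actually visible before submitting a job, a small diagnostic like the one below can help. This is only a sketch: the class name ClasspathCheck and its two helper methods are hypothetical names introduced for illustration, and it only reports what the class loader of the JVM it runs in can see, so it has to be launched with the same classpath as the job (for example, packaged into the same JAR and started with hadoop jar).

```java
import java.security.CodeSource;

// Hypothetical helper for illustration; not part of Hadoop or Mahout.
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns where the class was loaded from, or a short note if that is unknown. */
    public static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes typically have no code source.
            return src == null ? "(bootstrap or unknown location)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "org.apache.commons.cli2.Option";
        System.out.println(name + " available: " + isClassAvailable(name));
        System.out.println("loaded from: " + locationOf(name));
    }
}
```

  If the class is reported as not found, the JAR that contains it is missing from the classpath Hadoop uses. If it is found, the printed location shows which JAR supplied it.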

  
