现在非常喜欢IDEA,之前在mac 上用的eclipse 经常出现无缘无故的错误。所以转为IDEA.  不过新工具需要学习成本,手头上的项目就遇到了很多问题,现列举如下:

背景描述

在hadoop 开发时,经常在mr阶段将清洗后的数据入库到Hbase. 在这个过程中,需要编译、打jar包,然后上传到服务器,执行hadoop jar   *.jar 命令。每次清洗后需要手动4步操作。农民阿姨天生喜欢取巧,故这几天一直研究如何简化此过程。

思路描述

1.之前项目自动化打包上传都用ant ,不过是在window系统下eclipse开发的。但是在mac的IDEA中,屡次失败。总是出现如下错误信息

Exception in thread "main" java.lang.SecurityException: Invalid signature file digest for Manifest main attributes
at sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:287)
at sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:240)
at java.util.jar.JarVerifier.processEntry(JarVerifier.java:317)
at java.util.jar.JarVerifier.update(JarVerifier.java:228)
at java.util.jar.JarFile.initializeVerifier(JarFile.java:348)
at java.util.jar.JarFile.getInputStream(JarFile.java:415)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:101)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:81)
at org.apache.hadoop.util.RunJar.run(RunJar.java:209)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

网上搜了很多资料,解决方案都是失败。原因是 项目中引用的jar包签名有问题。(之前用eclipse导出jar包没有出现过这种情况)

2.用IDEA导出jar包,操作步骤如下
Fiel –>Project Structure

然后出现如下:

选好Manifest File ,Main Class

之后Build->Build Artifacts 可以编译打包了。去项目中 out 文件下可以找到。

但是通过这种打包方式,在运行时报错。因为hbase的lib相关包没有包含进去。这个一种笨的解决方案是在hadoop 中每个节点上的lib包下都拷贝一份hbase的lib包。这种方案有弊端,我没有选择。如何解决这个问题?我用maven 打包,上传jar包,运行,成功。

最终方案:

通过maven可以正常打包,那如何自动化上传并运行呢?我是用maven 打包,用ant 上传并执行的。

maven  pom 文件内容:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion> <groupId>groupId</groupId>
<artifactId>antdemo</artifactId>
<version>1</version> <dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>2.6.0</version>
</dependency> <dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>0.98.15-hadoop2</version>
</dependency> <dependency>
<groupId>com.jcraft</groupId>
<artifactId>jsch</artifactId>
<version>0.1.51</version>
</dependency>
<dependency>
<groupId>org.apache.ant</groupId>
<artifactId>ant-jsch</artifactId>
<version>1.9.5</version>
</dependency> </dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<!--这里要替换成jar包main方法所在类-->
<mainClass>HBaseImport</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id> <!-- this is used for inheritance merges -->
<phase>package</phase> <!-- 指定在打包节点执行jar包合并操作 -->
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build> </project>

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

ant 中的 build.xml的内容:

<?xml version="1.0" encoding="UTF-8"?>

<project name="项目名称" basedir="." default="sshexec">
<description>本配置文件供ANT编译项目、自动进行单元测试、打包并部署之用。</description>
<description>默认操作(输入命令:ant)为编译源程序并发布运行。</description> <!--属性设置-->
<property environment="env" />
<!--<property file="build.xml" />-->
<property name="build.dir" location="build.xml"/>
<property name="src.dir" value="${basedir}/src" />
<!--<property name="java.lib.dir" location="lib"/>-->
<property name="java.lib.dir" value="/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home/lib" />
<property name="classes.dir" value="${basedir}/classes" />
<property name="dist.dir" value="${basedir}/dist" />
<property name="third.lib.dir" value="/Users/zzy/Kclouds/01_mac/doc/workspace/other_workspace/hbase-0.98.8-hadoop2/lib" />
<!--<property name="third.lib.dir" value="{basedir}/lib" />--> <property name="localpath.dir" value="${basedir}" />
<property name="remote.host" value="192.168.122.211"/>
<property name="remote.username" value="root"/>
<property name="remote.password" value="lg"/>
<property name="remote.home" value="/test"/>
<!--每次需要知道的main类,写到这里-->
<property name="main.class" value="HBaseImport"/> <!-- 基本编译路径设置 --> <path id="compile.classpath">
<fileset dir="${java.lib.dir}">
<include name="tools.jar" />
</fileset>
<fileset dir="${third.lib.dir}">
<include name="*.jar"/>
</fileset> </path> <!-- 运行路径设置 -->
<path id="run.classpath">
<path refid="compile.classpath" />
<pathelement location="${classes.dir}" />
</path>
<!-- 清理,删除临时目录 -->
<target name="clean" description="清理,删除临时目录">
<!--delete dir="${build.dir}" /-->
<delete dir="${dist.dir}" />
<delete dir="${classes.dir}" />
<echo level="info">清理完毕</echo> </target>
<!-- 初始化,建立目录,复制文件 -->
<target name="init" depends="clean" description="初始化,建立目录,复制文件">
<mkdir dir="${classes.dir}" />
<mkdir dir="${dist.dir}" />
</target>
<!-- 编译源文件-->
<target name="compile" depends="init" description="编译源文件">
<javac srcdir="${src.dir}" destdir="${classes.dir}" source="1.7" target="1.7" includeAntRuntime="false" debug="false" verbose="false">
<compilerarg line="-encoding UTF-8 "/>
<classpath refid="compile.classpath" />
</javac>
</target>
<!-- 打包类文件 -->
<target name="jar" depends="compile" description="打包类文件">
<jar jarfile="${dist.dir}/jar.jar">
<fileset dir="${classes.dir}" includes="**/*.*" />
</jar>
</target>
<!--上传到服务器
**需要把lib目录下的jsch-0.1.51拷贝到$ANT_HOME/lib下,如果是Eclipse下的Ant环境必须在Window->Preferences->Ant->Runtime->Classpath中加入jsch-0.1.51。
-->
<!--<target name="ssh" depends="jar">-->
<!--<scp file="${dist.dir}/jar.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>-->
<!--</target>-->
<!-- -->
<!--<target name="sshexec" depends="ssh">-->
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/jar.jar ${main.class}"/>-->
<!--&lt;!&ndash;<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>&ndash;&gt;-->
<!--</target>--> <target name="ssh" depends="jar">
<scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target> <target name="sshexec" depends="ssh">
<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target> </project>
 
 
最关键的是:
<target name="ssh" depends="jar">
<scp file="${basedir}/target/antdemo-1-jar-with-dependencies.jar" todir="${remote.username}@${remote.host}:${remote.home}" password="${remote.password}" trust="true"/>
</target> <target name="sshexec" depends="ssh">
<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;hadoop jar ${remote.home}/antdemo-1-jar-with-dependencies.jar ${main.class}"/>
<!--<sshexec host="${remote.host}" username="${remote.username}" password="${remote.password}" trust="true" command="source /etc/profile;touch what.text ${main.class}"/>-->
</target>

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

这步是上传并执行,好好研究一下这步代码。

MR清洗后入库Hbase 代码:

 

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

package demo;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
//import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date; /**
* Created by zzy on 15/11/17.
*/ /**
* mr 中操作hbase
*/
public class HBaseImport {
static class BatchImportMapper extends Mapper<LongWritable,Text,LongWritable,Text>{
@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
super.map(key, value, context);
String line = value.toString();
String[] splited = line.split("\t");
SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMddHHmmss");
String time =sdf.format(new Date(Long.parseLong(splited[0].trim()))); String rowkey = splited[1]+"_"+time;
Text v2s = new Text();
v2s.set(rowkey+"\t" +line);
context.write(key,v2s); }
}
static class BatchImportReducer extends TableReducer<LongWritable,Text,NullWritable> {
private byte[] family = "cf".getBytes(); @Override
protected void reduce(LongWritable key, Iterable<Text> v2s, Context context) throws IOException, InterruptedException {
// super.reduce(key, values, context);
for (Text v2 :v2s){
String [] splited = v2.toString().split("\t");
String rowKey = splited[0];
Put put = new Put(rowKey.getBytes());
put.add(family,"raw".getBytes(),v2.toString().getBytes());
put.add(family,"reportTime".getBytes(),splited[1].getBytes());
put.add(family,"msisdn".getBytes(),splited[2].getBytes());
put.add(family,"apmac".getBytes(),splited[3].getBytes());
put.add(family,"acmac".getBytes(),splited[4].getBytes());
}
}
}
private static final String tableName = "logs";
private static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
Configuration conf = new Configuration();
conf.set("hbase.zookeeper.quorum","192.168.122.213:2181");
conf.set("hbase.rootdir","hdfs://192.168.122.211:9000/hbase");
conf.set(TableOutputFormat.OUTPUT_TABLE, tableName); Job job = Job.getInstance(conf,HBaseImport.class.getSimpleName()); TableMapReduceUtil.addDependencyJars(job);
job.setJarByClass(HBaseImport.class);
job.setMapperClass(BatchImportMapper.class);
job.setReducerClass(BatchImportReducer.class); job.setMapOutputKeyClass(LongWritable.class);
job.setMapOutputValueClass(Text.class); job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TableOutputFormat.class); FileInputFormat.setInputPaths(job, "hdfs://192.168.122.211:9000/user/hbase");
job.waitForCompletion(true);
// FileInputFormat.setInputPaths(job,""); }
}

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, "Courier New", courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

本文章如何对阁下有帮助,希望您高举贵手,支持一下。

Mac 下用IDEA时maven,ant打包 (mr 入库hbase)的更多相关文章

  1. 解决Mac下AndroidStudio内容时卡顿

    Mac下AndroidStudio在写代码的时候出现卡顿,小圆圈会一直转,此时我们应该检查下AndroidStudio的内存使用情况了. 1.点击左上角 AndroidStudio -- Prefer ...

  2. 解决mac下安装yeoman时没有权限问题

    在mac下安装yeoman经常会出现如下图错误: 解决办法:在命令行执行-- sudo chown -R $USER /usr/local/lib/node_modules 回车就OK

  3. Mac下配置Idea的Maven

    环境版本: Mac OS: 10.13.4 JDK: 1.8 Idea: 2018.3 Maven: 3.6.0 Maven 相关配置: Maven 下载: http://maven.apache.o ...

  4. 【Vegas原创】MAC下,idea手动maven jar包的方法

    1,到自己的项目目录下 Vegass-MacBook-Air:gms-boyol Vegas$ pwd/Users/Vegas/SynologyDrive/Coding/workspace/gms-b ...

  5. Mac下安装和配置Maven

    1.下载Maven 官网:http://maven.apache.org/download.cgi 下载版本:apache-maven-3.5.3-bin.tar.gz 2.配置环境变量 打开term ...

  6. mac下配置eclipse的maven环境

    转自:http://www.cnblogs.com/yqskj/archive/2013/03/30/2990292.html 1.下载maven的bin包,解压,配置到环境变量里面去 1). 首先到 ...

  7. mac下的应用程序发布 及 打包(Python写的脚本,可打包第三方库)

    其实这个问题在网上能搜到大把的解决方案.大家的统一答案都是 otool -L yourapp.app/Contents/MacOS/yourapp 根据输出信息在运行 install_name_too ...

  8. Mac下eclipse 启动时出现An error has occurred. See the log file的问题

    eclipse原来可以使用的好好的,装了多个版本的jdk后,打开eclipse出现An error has occurred. See the log file的问题,经过查找,可能原因之一是机子装了 ...

  9. Mac下通过homebrew安装maven

    1.安装Homebrew 将以下命令粘贴至终端 /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebr ...

随机推荐

  1. Loadrunner在post请求的cookie中插入其它多个值

    有一个商城,提交订单的时候,需要在post请求的cookie中set一个code码,便于防刷 tijiao() { lr_think_time(); web_reg_save_param(" ...

  2. 生活中的MVC模式,一个吃货的理解。

    以下是生活中对于MVC模式的领悟,虽然可笑,轻喷. 2015年 8月 26日 M  => Model       模型 我认为叫做模具更好的理解.批量加工生产具有相同特征的东西.        ...

  3. mysql forget root password

    http://www.rackspace.com/knowledge_center/article/mysql-resetting-a-lost-mysql-root-password MySQL - ...

  4. Tomcat_Java Web_内存溢出总结

    在抛出内存溢出错误的时候,一般都会提示内存泄露的种类,一般也都是按照区域进行划分: 1. 堆(heap)内存泄漏java.lang.OutOfMemoryError: Javaheap space:大 ...

  5. HTML1

    1.  前端:html:做网页的内容 CSS: 做网页的外观 JS:执行网页的动作 前端是静态网页,内容写死.要有变化的内容,需要数据库 浏览器去解析生成漂亮的界面 后台技术:.Net C# 数据库: ...

  6. openssl生成https证书 (转)

    1.首先要生成服务器端的私钥(key文件):openssl genrsa -des3 -out server.key 1024运行时会提示输入密码,此密码用于加密key文件去除key文件口令的命令:o ...

  7. webstorm9.3 安装less 编译css教程

    第一步:安装node.js webstrom9.3汉化下载:http://pan.baidu.com/s/1ntGNNiL    密码:1fbh node.js 下载地址:https://nodejs ...

  8. CentOS7安装hive-2.1.0

    环境: CentOS7 Hadoop-2.6.4,配置两个节点:master.slave1 mysql-server 过程: 下载.解压hive-2.1.0到/usr/hadoop-2.6.4/thi ...

  9. ORACLE ORA-01157: 无法标识/锁定数据文件

    create undo tablespace MOZI datafile 'E:\oracle\product\10.2.0\oradata\orcl\MOZI.DBF' size 2048M ext ...

  10. redis主从复制操作

    1. 33.10服务器上 启动三个redis bin/redis-server etc/redis.conf bin/redis-server etc/6380conf bin/redis-serve ...