Importing HDFS Data into HBase with MapReduce (Part 2)

package com.bank.service;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

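/**
* Bulk-imports data into HBase with MapReduce.
* Output goes through TableOutputFormat, which receives each Put instance and calls table.put() internally.
* Before the job ends, it calls flushCommits() to persist any data still held in the write buffer.
*
* @author mengyao
*
*/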
public class CnyBatch extends Configured implements Tool {

static class CnyBatchMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
       protected void map(LongWritable key, Text value, Context context)
               throws java.io.IOException, InterruptedException {
           context.write(key, value);
       }
   }

static class CnyBatchReduce extends TableReducer<LongWritable, Text, NullWritable> {
       private final static String familyName = "info";
       private final static String[] qualifiers = {"gzh", "currency", "version", "valuta", "qfTime", "flag", "machineID"};
       @Override
       protected void reduce(LongWritable key,
               java.lang.Iterable<Text> values, Context context)
               throws java.io.IOException, InterruptedException {
           // Iterate over every line emitted for this key; calling toString()
           // on the Iterable itself would not return the line text.
           for (Text value : values) {
               final String[] fields = value.toString().split("\t");
               if (fields.length == qualifiers.length) {
                   // Row key: the first four fields joined with "_"
                   final String row = fields[0]+"_"+fields[1]+"_"+fields[2]+"_"+fields[3];
                   long timestamp = System.currentTimeMillis();
                   Put put = new Put(Bytes.toBytes(row));
                   // One cell per field, all in the "info" column family
                   for (int i = 0; i < fields.length; i++) {
                       put.add(Bytes.toBytes(familyName), Bytes.toBytes(qualifiers[i]), timestamp, Bytes.toBytes(fields[i]));
                   }
                   context.write(NullWritable.get(), put);
               } else {
                   System.err.println(" ERROR: value length must equal qualifier length ");
               }
           }
       }
   }

@Override
   public int run(String[] arg0) throws Exception {
       Job job = Job.getInstance(getConf(), CnyBatch.class.getSimpleName());
       TableMapReduceUtil.addDependencyJars(job);
       job.setJarByClass(CnyBatch.class);

       FileInputFormat.setInputPaths(job, arg0[0]);
       job.setMapperClass(CnyBatchMapper.class);
       job.setMapOutputKeyClass(LongWritable.class);
       job.setMapOutputValueClass(Text.class);

       job.setReducerClass(CnyBatchReduce.class);
       job.setOutputFormatClass(TableOutputFormat.class);


       return job.waitForCompletion(true) ? 0 : 1;
   }

public static void main(String[] args) throws Exception {
       Configuration conf = new Configuration();
       conf.set("hbase.zookeeper.quorum", "h5:2181,h6:2181,h7:2181");
       conf.set("hbase.zookeeper.property.clientPort", "2181");
       conf.set("dfs.socket.timeout", "100000");
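       // Apply generic Hadoop options to conf and keep only the application arguments
       String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
       if (otherArgs.length != 2) {
           System.err.println(" ERROR: <dataInputDir> <tableName>");
           System.exit(2);
       }
       // Use the parsed arguments, not the raw args, so generic options do not shift the index
       conf.set(TableOutputFormat.OUTPUT_TABLE, otherArgs[1]);
       int status = ToolRunner.run(conf, new CnyBatch(), otherArgs);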
       System.exit(status);
   }
}
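
The line-parsing and row-key logic in the reducer can be tried out without a cluster. The following is only a minimal sketch under stated assumptions: the class name `CnyRecordSketch`, its helper methods, and the sample record are hypothetical and not part of the original job; only the qualifier list and the "first four fields joined with `_`" rule are taken from the listing above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone helper; mirrors the reducer's parsing without any HBase dependency.
class CnyRecordSketch {
    // Same qualifier order as in CnyBatchReduce.
    static final String[] QUALIFIERS =
            {"gzh", "currency", "version", "valuta", "qfTime", "flag", "machineID"};

    // Joins the first four fields with "_", as the reducer does for the row key.
    static String rowKey(String[] fields) {
        return fields[0] + "_" + fields[1] + "_" + fields[2] + "_" + fields[3];
    }

    // Returns qualifier -> value in column order, or null when the field count does not match.
    static Map<String, String> parse(String line) {
        String[] fields = line.split("\t");
        if (fields.length != QUALIFIERS.length) {
            return null;
        }
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i < fields.length; i++) {
            cells.put(QUALIFIERS[i], fields[i]);
        }
        return cells;
    }

    public static void main(String[] args) {
        // Made-up sample record, for illustration only.
        String line = "AB12345678\tCNY\t2005\t100\t20140101\t0\tM001";
        System.out.println(rowKey(line.split("\t")));
        System.out.println(parse(line));
    }
}
```

One thing worth checking against real input: `String.split("\t")` drops trailing empty strings, so a record whose last field is empty yields six fields and is rejected by the length check. If such records should be accepted, `split("\t", -1)` keeps the trailing empty field.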
